THE ENJOYMENT
OF MATHEMATICS

Hans Rademacher and Otto Toeplitz

This book leads the reader
step by step to the heart of
a selection of intriguing
mathematical problems, while
requiring no more mathematical
background than most people
acquire in high school.


$4.50 


THE ENJOYMENT 
OF MATHEMATICS 


Do the prime numbers go on forever? 


How many routes must a streetcar company 
operate in order to serve all parts of a town with- 
out having more than one route on any section? 


Are there more whole numbers than even 
numbers? 


What curves can be drawn in a single stroke, 
without retracing any line or lifting the pencil 
from the paper? 


How many colors are needed to color a map? 


What is the triangle of shortest perimeter 
that can be inscribed in a given triangle? 


What is the smallest circle that will enclose 
a finite set of points? 


Requiring no more mathematical background 
than most people acquire in high school (plane 
geometry and elementary algebra), this book 
introduces the reader to some of the funda- 
mental ideas of mathematics, the ideas that 
make mathematics exciting and interesting. 

When first published under the title of Von 
Zahlen und Figuren in 1930, it was soon recog- 
nized as a model of exposition of subtle mathe- 
matical ideas for both students and laymen. 
Several new chapters have been added to this 
edition, which has been translated by Herbert 
Zuckerman. All the material from the second 
German edition has been retained. 

A reviewer in The Mathematics Teacher 
called this “a most delightful and intriguing 
book,” and in Science the reviewer said, “This is 
an excellent and welcome translation. . . . The 
English title is very appropriate for the book is 
indeed a thoroughly enjoyable sampler of fasci- 
nating mathematical problems and their solu- 
tions... . The book certainly achieves what the 
authors sought to accomplish.” 

Excerpts from other reviews that have appeared
may be found on the inside back flap.


Preface


Otto Toeplitz, co-author of this book, died in Jerusalem on Fe- 
bruary 19, 1940, after having left Germany in the Spring of 1939. 
Toeplitz began his academic career in Göttingen as a disciple of
David Hilbert, was then professor in Kiel and later in Bonn. His 
scientific work is centered around the theory of integral equations 
and the theory of functions of infinitely many variables, fields to 
which he has made lasting contributions. 

The plan for this book arose at frequent meetings which the 
authors had, while Toeplitz was in Kiel and I was at the University 
of Hamburg. Both of us had repeatedly lectured about the subject 
matter of this book to a wider public. The manuscript was rewritten 
several times under mutual criticism. Toeplitz’s great interest in the 
history of mathematics is still visible in the present edition. I re- 
member with warm feelings the summer days in 1929 at Bonn, when 
together we put the last touches to the German manuscript. I am 
sure that Toeplitz would have been pleased and proud of the present 
English edition, a project of which he had often thought. 

I wish to thank the translator Professor Herbert S. Zuckerman for 
his painstaking and understanding work. Not only do I admire his 
apt translation of the German text, but I also think that in his pre- 
sentation of the content he has brought many of its arguments closer 
to the English speaking reader. He has added two chapters (15 and 
28) to the book, which faithfully reflect the spirit in which this book 
was written. 

My thanks go to my friends Emil Grosswald, D. H. Lehmer, and 
Herbert Robbins for help and valuable advice, and also to the 
publisher for his sympathetic cooperation. 


Philadelphia, 1956 Hans RADEMACHER 


The Enjoyment of Mathematics


Introduction 


Mathematics, because of its language and notation and its odd- 
looking special symbols, is closed off from the surrounding world as 
by a high wall. What goes on behind that wall is, for the most part, 
a secret to the layman. He thinks of dull uninspiring numbers, of 
a lifeless mechanism which functions according to laws of inescapable 
necessity. On the other hand, that wall very often limits the view 
of him who stays within. He is prone to measure all mathematical 
things with a special yardstick and he prides himself that nothing 
profane shall enter his realm. 

Is it possible to breach this wall, to present mathematics in such 
a way that the spectator may enjoy it? Cannot the enjoyment of 
mathematics be extended beyond the small circle of those who are 
“mathematically gifted”? Indeed, only a few are mathematically
gifted in the sense that they are endowed with the talent to discover 
new mathematical facts. But by the same token, only very few are 
musically gifted in that they are able to compose music. Never- 
theless there are many who can understand and perhaps reproduce 
music, or who at least enjoy it. We believe that the number of 
people who can understand simple mathematical ideas is not 
relatively smaller than the number of those who are commonly 
called musical, and that their interest will be stimulated if only we 
can eliminate the aversion toward mathematics that so many have 
acquired from childhood experiences. 

It is the aim of these pages to show that the aversion toward 
mathematics vanishes if only truly mathematical, essential ideas are 
presented. This book is intended to give samples of the diversified
phenomena which comprise mathematics, of mathematics for its
own sake, and of the intrinsic values which it possesses.

The attempt to present mathematics to nonmathematicians has 
often been made, but this has usually been done by emphasizing the 
usefulness of mathematics in other fields of human endeavor in an 
effort to secure the comprehension and interest of the reader. 
Frequently the advantages which it offers in technological and other 
applications have been described and these advantages have been 
illustrated by numerous examples. On the other hand, many books 
have been written on mathematical games and pastimes. Although
these books contain much interesting material, they give at best a 
very distorted picture of what mathematics really is. Finally, other 
books have discussed the foundations of mathematics with regard 
to their general philosophical validity. A reader of the following 
pages who is primarily interested in the pure, the absolute mathema- 
tics will naturally direct his attention toward just such an epistemolo- 
gical evaluation of mathematics. But this seems to us to be attaching 
an extraneous value to mathematics, to be judging its value according 
to measures outside itself. 

In the following pages we will not be able to demonstrate the 
effects of the ideas to be presented on the domain of mathematics 
itself. We cannot consider the interior applications, so to speak, 
of mathematics, the use of the ideas and results of one field in other 
fields of mathematics. This means that we must omit something 
that is quite essential in the nature of the mathematical edifice, the
great and surprising cross-connections that permeate this edifice in 
all directions. This omission is quite involuntary on our part, for 
the greatest mathematical discoveries are those which have revealed 
just such far-reaching interrelations. In order to present these inter- 
relations, however, we would need long and comprehensive prepara- 
tions and would have to assume a thorough training on the part of 
the reader. This is not our intention here. 

In other words, our presentation will emphasize not the facts as 
other sciences can disclose them to the outsider but the types of 
phenomena, the method of proposing problems, and the method of solving 
problems. Indeed, in order to understand the great mathematical 
events, the comprehensive theories, long schooling and persistent 
application would be required. But this is also true with music. 
On going to a concert for the first time one is not able to appreciate 
fully Bach’s “The Art of Fugue,” nor can one immediately visualize
the structure of a symphony. But besides the great works of music 
there are the smaller pieces which have something of true sublimity 
and whose spirit reveals itself to everyone. We plan to select such 
“smaller pieces” from the huge realm of mathematics: a sequence 
of subjects each one complete in itself, none requiring more than an 
hour to read and understand. The subjects are independent, so 
that one need not remember what has gone before when reading 
any chapter. Also, the reader is not required to remember what 
he may have been compelled to learn in his younger years. No 
use is made of logarithms or trigonometry. No mention is made of 
differential or integral calculus. The theorems of geometrical
congruence and the multiplication of algebraic sums will gradually
be brought back to the memory of the reader; that will be all.

In the case of a small work of music it may not only be the line
of melody with which it opens that makes it beautiful. A little
variation of the theme, a surprising modulation may well be the 
climax of the whole. Only he who has listened attentively to the 
basic theme will fully perceive and understand this climax. In a 
similar sense our reader will have to “listen” readily and attentively 
to the basic motive of a problem, to its development, to the first 
few examples which illustrate each theme, before the decisive 
modulation to the cardinal thought takes place. He will have to 
follow the reasoning with a little more active attention than is 
usually required in reading. If he does this, he will find no dif- 
ficulty in grasping the essential ideas of each subject. He will 
then get a glimpse of what a few great thinkers have created when 
they have occasionally left the realm of their comprehensive theoretical 
production and have built, from simple beginnings, a small self- 
contained piece of art, a fragment of the prototype of mathematics. 


1. The Sequence of Prime Numbers 


6 is equal to 2 times 3, but 7 cannot be similarly written as a 
product of factors. Therefore 7 is called a prime or prime number. 
A prime is a positive whole number which cannot be written as the 
product of two smaller factors. 5 and 3 are primes but 4 and 12 
are not, since we have 4 = 2·2 and 12 = 3·4. Numbers which
can be factored like 4 and 12 are called composite. The number 1 
is not composite but, because it behaves so differently from other 
numbers, it is not usually considered a prime either; consequently 
2 is the first prime, and the first few primes are: 


    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, ....


A glance reveals that this sequence does not follow any simple law 
and, in fact, the structure of the sequence of primes turns out to 
be extremely complicated. 

A number can be factored a step at a time until it is reduced to 
a product of primes. Thus 6 = 2·3 is immediately expressed as a
product of two primes, while 30 = 5·6 and 6 = 2·3 gives
30 = 2·3·5, a product of three primes. Similarly, 24 has four
prime factors (24 = 3·8 = 3·2·4 = 3·2·2·2), of which three
happen to be the same prime, 2. In the case of a prime such as 
5 one can only write 5 = 5, a product of a single prime. By means 
of this step-by-step factoring, any positive whole number except 1 
can be written as a product of primes. Because of this, the prime 
numbers can be thought of as the building blocks of the sequence 
of all positive whole numbers. 
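
The step-by-step factoring just described can be tried out mechanically. What follows is only a minimal sketch in Python; the function name `prime_factors` is an illustrative choice and not part of the text. It splits off the smallest possible factor again and again until nothing but primes remains.

```python
def prime_factors(n):
    """Return the prime factors of n (a whole number > 1) in increasing order."""
    if n < 2:
        raise ValueError("n must be a whole number greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # split off the factor d as often as it occurs
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # what remains has no smaller factor, so it is prime
        factors.append(n)
    return factors


for number in (6, 30, 24, 5):
    print(number, "=", " * ".join(str(q) for q in prime_factors(number)))
```

For 24 the sketch reports the four prime factors 2, 2, 2, 3, and for the prime 5 it reports the single factor 5, in agreement with the examples above.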

In the ninth book of Euclid’s Elements the question of whether the 
sequence of primes eventually ends is raised and answered. It is shown 
that the sequence has no end, that after each prime yet another and 
larger prime can be found. 

Euclid’s proof is very ingenious yet quite simple. The numbers 
3, 6, 9, 12, 15, 18, ... are all multiples of 3. No other numbers
can be divided evenly by 3. The next larger numbers 4, 7, 10,
13, 16, 19, ..., which are multiples of 3 increased by 1, are certainly
not divisible by 3; for example, 19 = 6·3 + 1, 22 = 7·3 + 1, etc.
In the same way the multiples of 5 increased by 1 are not divisible
by 5 (21 = 4·5 + 1, etc.). The same thing is true for 7, for 11,
and so on.
Now Euclid writes down the numbers 


    2·3 + 1 = 7
    2·3·5 + 1 = 31
    2·3·5·7 + 1 = 211
    2·3·5·7·11 + 1 = 2311
    2·3·5·7·11·13 + 1 = 30031, etc.


The first two primes, the first three primes, and so forth, are 
multiplied together and each product is increased by 1. None of 
these numbers is divisible by any of the primes used to form it. 
Since 31 is a multiple of 2 increased by 1, it is not divisible by 2. 
It is a multiple of 3 increased by 1 and hence is not divisible by 
3. It is a multiple of 5 increased by 1, hence not divisible by 5. 
31 happens to be a prime and it is certainly larger than 5. 211 and 
2311 are also new primes, but 30031 is not a prime. However, 
30031 is not divisible by 2, 3, 5, 7, 11, or 13, and hence its prime 
factors are greater than 13. As a matter of fact, a little figuring 
shows that 30031 = 59-509, and these prime factors are greater 
than 13. 
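
The numbers in this list are small enough to be checked by machine. The following Python sketch, whose function names are assumptions made for illustration, forms each product increased by 1 and looks for its smallest prime factor by trial division.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n (n itself when n is a prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def euclid_numbers(primes):
    """Yield the product of the first k primes, increased by 1, for k = 2, 3, ..."""
    product = primes[0]
    for p in primes[1:]:
        product *= p
        yield product + 1


for n in euclid_numbers([2, 3, 5, 7, 11, 13]):
    q = smallest_prime_factor(n)
    print(n, "is a prime" if q == n else f"= {q} * {n // q}")
```

The last line printed concerns 30031, whose smallest prime factor 59 is indeed greater than 13.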

The same argument may be applied as far as one wants to go. 
Let p be any prime and form the product of all primes from 2 to p;
increase this product by 1 and write

    2·3·5 ··· p + 1 = N.

None of the primes 2, 3, 5, ..., p divides N, so either N is a prime
(certainly much greater than p) or all the prime factors of N are
different from 2, 3, 5, ..., p, and hence greater than p. In either
case, a new prime greater than p has been found. No matter how 
large p is there is always another larger prime. 

This part of Euclid is quite remarkable, and it would be hard 
to name its most admirable feature. The problem itself is only of 
theoretical interest. It can be proposed, for its own sake, only by 
a person who has a certain inner feeling for mathematical thought. 
This feeling for mathematics and appreciation of the beauty of 
mathematics was very evident in the ancient Greeks, and they have 
handed it down to later civilizations. Also, this problem is one that 
most people would completely overlook. Even when it is brought 
to our attention it appears to be trivial and superfluous, and its real 
difficulties are not immediately apparent. Finally, we must admire 
the ingenious and simple way in which Euclid proves the theorem.
The most natural way to try to prove the theorem is not Euclid’s.
It would be more natural to try to find the next prime number
following any given prime. This has been attempted but has always
ended in failure because of the extreme irregularity of the formation
of the primes.

Euclid’s proof circumvents the lack of a law of formation for the
sequence of primes by looking for some prime beyond instead of for 
the next prime after p. For example, his proof gives 2311, not 13, 
as a prime past 11, and it gives 59 as one past 13. Frequently there 
are a great many primes between the one considered and the one 
given by the proof. This is not a sign of the weakness of the proof, 
but rather it is evidence of the ingenuity of the Greeks in that they 
did not try to do more than was required. 

As an illustration of the complexity of the sequence of primes, we 
shall show that there are large gaps in the series. We shall show, 
for example, that we can find 1000 consecutive numbers, all of which 
are composite. The method is closely related to that of Euclid. 

We saw that 2·3·5 + 1 = 31 is not divisible by 2, 3, or 5.
If two numbers are divisible by 2 then their sum is certainly divisible
by 2 as well. The same is true for 3, 5, etc. Now, 2·3·5 is
divisible by 2, 3, and 5, so 2·3·5 + 2 = 32 is divisible by 2,
2·3·5 + 3 = 33 by 3, 2·3·5 + 4 = 34 by 2, 2·3·5 + 5 = 35
by 5, and 2·3·5 + 6 = 36 by 2. Therefore none of the numbers
32, 33, 34, 35, 36 is a prime. This argument fails for the first time
for 2·3·5 + 7 = 37, which is not divisible by 2, 3, or 5.
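
The small case can be verified directly. In the Python sketch below, the helper name `witness` is an illustrative assumption. Each of the numbers 32 to 37 is tested against the primes 2, 3, and 5.

```python
BASE = 2 * 3 * 5


def witness(n, primes=(2, 3, 5)):
    """Return the first of the given primes that divides n, or None if none does."""
    for p in primes:
        if n % p == 0:
            return p
    return None


for k in range(2, 8):
    n = BASE + k
    p = witness(n)
    if p is None:
        print(n, "is not divisible by 2, 3, or 5")
    else:
        print(n, "is divisible by", p)
```

The argument yields a divisor for every number from 32 to 36 and first fails at 37.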

In the same way we can find 1000 consecutive composite numbers. 
Let p be the first four-digit prime (1009) and form the 1000 numbers

    2·3 ··· p + 2, 2·3 ··· p + 3, ..., 2·3 ··· p + 1001.

Each of the numbers 2, 3, 4, 5, ..., 1001 is divisible by one of
the primes 2, 3, ..., p, and so is 2·3 ··· p. Therefore each of the
numbers listed is also divisible by one of 2, 3, ..., p, and hence is
not a prime itself. We have thus found 1000 consecutive numbers 
none of which is prime, and consequently there is a gap of at least 
1000 numbers in the sequence of primes. 
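
Although the product of all primes up to 1009 has several hundred digits, the claim can still be tested. The Python sketch below makes two choices of its own: a sieve (here called `primes_up_to`) collects the primes, and the greatest common divisor stands in for the phrase "divisible by one of 2, 3, ..., p". Any common divisor of the product and the product increased by k also divides k.

```python
from math import gcd, isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes not exceeding n."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


p = 1009
product = 1
for q in primes_up_to(p):
    product *= q

# For each k from 2 to 1001, record a divisor shared by product + k and product.
divisors = {k: gcd(product + k, product) for k in range(2, 1002)}

print(len(divisors))                                               # 1000
print(all(1 < d < product + k for k, d in divisors.items()))       # True
```

Each of the 1000 numbers has a divisor larger than 1 and smaller than itself, so none of them is a prime.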

Naturally a gap of such length will not occur until we have gone 
quite far along in the sequence of primes, but if we go far enough 
we can, by the same method, find gaps as long as we may desire. 

This problem and Euclid’s are very similar both in nature and 
proof, yet the question of gaps in the sequence of primes was not 
considered by the Greeks. It was taken up by modern mathemati-
cians along with a great number of other related problems. Most 
of these other problems are not as easy to solve; some remain unsolved 
at the present time, while others have led to entirely new fields of 
mathematics. 

Let us consider one of these further problems which we can still 
handle by the same methods and which will give a certain insight 
into some of the others. The multiples of 3 are 3, 6, 9, ..., and these
numbers increased by 1, viz. 4, 7, 10, ..., have occurred above.
The remaining numbers 2, 5, 8, 11, 14, 17, 20, 23, ... are the
numbers which have the remainder 2 when divided by 3. Do these
last numbers contain infinitely many primes? That is, does the
sequence 2, 5, 11, 17, 23, ... never end? We shall prove that
there is an infinite number of these primes. 

First we must show that if any two of the numbers 1, 4, 7, 10,
13, ... are multiplied together, the product is again one of these
same numbers. Each of the numbers is a multiple of 3 increased
by 1, say 3x + 1. If a second number is 3y + 1, then their product is


    (3x + 1)(3y + 1) = 9xy + 3y + 3x + 1
                     = 3(3xy + y + x) + 1,        (1)


which is again a multiple of 3 increased by 1 and hence is in our 
original set of numbers. 
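
A numerical check is no substitute for the algebra, but it may reassure the reader. The Python sketch below (the names are illustrative) tries the identity and the conclusion drawn from it on the first forty numbers 1, 4, 7, ....

```python
def kind(n):
    """Remainder of n on division by 3 (it is 1 for the numbers 1, 4, 7, 10, ...)."""
    return n % 3


numbers = [3 * x + 1 for x in range(40)]           # 1, 4, 7, ..., 118

# Every product of two such numbers is again a multiple of 3 increased by 1.
products_stay_in_set = all(kind(a * b) == 1 for a in numbers for b in numbers)

# The two sides of the identity agree for every pair x, y that is tried.
identity_holds = all(
    (3 * x + 1) * (3 * y + 1) == 3 * (3 * x * y + y + x) + 1
    for x in range(40)
    for y in range(40)
)

print(products_stay_in_set, identity_holds)
```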

Now if any of the numbers 2, 5, 8, 11, ... is split into its prime
factors, at least one of the prime factors must again be one of 2, 5,
8, 11, .... For example, 14 = 2·7 has the factor 2 and 35 = 5·7
has 5. To show that this is always true, we note that each of the
prime factors must be either a multiple of 3, in the sequence 1, 4,
7, 10, ..., or in the sequence 2, 5, 8, .... The only multiple of
3 that is a prime is 3 itself, and it does not divide our chosen number.
If all the prime factors were of the kind 1, 4, 7, 10, ..., then by
the above remark (1) our number would again be of this kind.
Since it is not of this kind, at least one of its prime factors must be
one of the kind 2, 5, 8, 11, ....
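
This statement, too, can be tried on examples. The following Python sketch uses the hypothetical helper `factor_of_kind_2`. It factors a number of the sequence 2, 5, 8, 11 and picks out a prime factor that leaves the remainder 2 when divided by 3.

```python
def prime_factors(n):
    """Prime factors of n (n > 1) in increasing order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def factor_of_kind_2(n):
    """Smallest prime factor of n with remainder 2 on division by 3, or None."""
    for q in prime_factors(n):
        if q % 3 == 2:
            return q
    return None


print(factor_of_kind_2(14), factor_of_kind_2(35))

# No exception is found among the numbers 2, 5, 8, ... below 3000.
print(all(factor_of_kind_2(n) is not None for n in range(2, 3000, 3)))
```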

We can now proceed as in Euclid’s proof except that we consider 


    2·3·5 ··· p − 1 = M

in place of

    2·3·5 ··· p + 1 = N.


M is a multiple of 3 decreased by 1, which means that it is one of
2, 5, 8, 11, .... Just as with N, it is clear that M is not divisible
by any of the primes 2, 3, 5, 7, 11, ..., p. Either M is a prime
(greater than p), or all its prime factors are greater than p. In
the latter case at least one of the prime factors is one of 2, 5, 8, ...,
and hence in both cases we have found a prime of this kind that
is greater than p. Therefore the sequence 2, 5, 8, ... contains
an infinite number of primes. 
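
For small p the construction can be carried out explicitly. The Python sketch below uses a function name of its own choosing and assumes that the list passed in contains all primes from 2 up to some p not less than 3. It forms M and returns it together with a prime factor of the kind 2, 5, 8, ....

```python
def prime_factors(n):
    """Prime factors of n (n > 1) in increasing order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def prime_of_kind_2_beyond(primes):
    """Given all primes 2, 3, ..., p (p >= 3), return (M, q).

    M is the product of the primes decreased by 1, and q is the smallest
    prime factor of M that leaves the remainder 2 on division by 3.
    """
    M = 1
    for q in primes:
        M *= q
    M -= 1
    for q in prime_factors(M):
        if q % 3 == 2:
            return M, q
    raise AssertionError("cannot happen, by the argument in the text")


for primes in ([2, 3], [2, 3, 5], [2, 3, 5, 7], [2, 3, 5, 7, 11]):
    M, q = prime_of_kind_2_beyond(primes)
    print("p =", primes[-1], " M =", M, " new prime:", q)
```

For p = 7, for instance, M = 209 = 11·19, and the prime found is 11.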

This leaves open the question whether the sequence 1, 4, 7, 10,
13, ... contains an infinite number of primes. It is quite possible
that 2, 5, 8, ... could contain infinitely many primes while 1, 4, 7, ...
contained only a finite number. The fact is that the latter sequence 
also contains infinitely many primes, but the proof of this requires 
completely different methods. In a later chapter we shall gain 
some insight into these methods. The reason for mentioning these 
last problems was to point out how the problems and methods of 
modern mathematics are related to those of the Greeks. This is 
not only true in isolated cases, but in whole areas of modern mathe- 
matics, areas of very great interest in which research is still being 
carried on. 


2. Traversing Nets of Curves 


A streetcar company decides to reorganize its system of routes 
without changing the existing tracks. It wishes to do this in such a 
way that each section of track will be used by just one route. 
Passengers will be allowed to transfer from route to route until 
they finally reach their destinations. The problem is: how many 
routes must the company operate in order to serve all sections without having 
more than one route on any section? 

For a very small city with car lines as in Fig. 1, the problem is
quite simple. One route could go from A to B and a second from
C to D, both passing through K. Or a line could go from A through
K to D and a second from B through K to C. Finally, a line could
go from A through K to C and another from B through K to D. 
Each of these possibilities necessitates two routes. Of course the 
company could set up a new transfer point at R, end a route there, 
and run a shuttle from R to B. But this would only increase the 
total number of routes needed. By adding more transfer points 
the number of routes could be increased indefinitely, but we are not 
interested in doing this, since the original problem asks for the 
smallest possible number of routes. 

Fig. 2 is still a fairly simple network of lines, but the problem is
more complicated in this case. One possibility is for one route to 
run from A around through B, C, D, E, and back to A (a closed route). 


Fig. 1


A second route might go from A through F, G, H, to D. Three 
more routes, BF, EG, CH, would be needed, making five in all. 
However, this is not the best possible arrangement. The first two 
routes could be combined into a single one running from A around 
through B, C, D, E, and back to A, and then on through F, G, 
and H to D. This cuts the number of routes to four, but would 
it be possible to set up just three routes? 

In the network of Fig. 3 one route might proceed from A through 
B, C, D, and E, back to A, and then on through F to B. That 
would leave three sections, CF, DF, EF, of which two could be 

Fig. 2   Fig. 3


combined into a single route CFD, leaving EF as a route by itself. 
In this case the first two routes pass through F while the third ends 
at F. The question still remains: would two routes do just as well? 

It would not be unduly difficult to list all the possible routes for 
Figs. 2 and 3 (as we did for Fig. 1) in order to be sure we have the 
minimum number of routes. However, this would be quite a long 
procedure for a fairly complicated network, and in any case it 
would not be a very interesting problem. Also, it would no more 
be mathematics merely to list all the possibilities for any given 
network than it is mathematics to multiply together two seven-digit 
numbers. The spirit of mathematics dictates that we seek out the 
essentials of the problem and use them to solve it. 

The essential thing in this problem is quite simple. We must
consider where the ends of the various routes must lie. There must 
be an end wherever a section has a free end as at A, B, C, and D 
in Fig. 1. Since in this case there must be four ends, and since 
each route has no more than two ends (a closed route has no ends 
of course), it is clear that there must be at least two routes between 
these four ends. By means of a single consideration we have ob- 
tained the same result as was obtained above by considering all the 
possible cases. 

In Fig. 2 there are no free ends, but here there are places such 
as A where three sections come together. At such a place at least 
one route must end, since two routes can never be on the same 
section. In Fig. 2 there are eight places of this kind, so there must 
be at least eight ends of routes. Eight is an even number, the num- 
ber of ends of four routes. Therefore there must be at least four 
routes; and, as we have already seen, four routes will do. 

In Fig. 3 there are five junctions of the kind we have been dis- 
cussing, and a sixth point F where not three but five sections come 
together. We shall call a point like F a junction of the fifth order. 
At F two pairs of sections might be joined to form routes, leaving 
one over, as was done in the first discussion of Fig. 3. The same 
thing would be true for any junction of odd order; one section is 
always left over, and hence there must be at least one end there. 
We now see that Fig. 3 requires at least six ends (an even number) 
and hence at least three routes. 

For any network, no matter how complicated, we can easily 
count the junctions of odd order and divide by 2 to obtain the least 
possible number of routes. In the three examples so far, the number 
of junctions of odd order was always even, and it turned out that 
half this number of routes was not only necessary but also sufficient 
to meet the requirements of the problem. We would now like to 
determine whether or not the number of junctions of odd order is 
always even, and whether half this number of routes will always do 
for any given network. 
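
The counting rule lends itself to a short program. In the Python sketch below the three networks are written as lists of sections. These lists are reconstructions from the routes named in the text, not copies of the figures, and the names `odd_junctions` and `least_number_of_routes` are illustrative.

```python
from collections import Counter

# Sections of the three networks, reconstructed from the routes described above.
FIG1 = [("A", "K"), ("B", "K"), ("C", "K"), ("D", "K")]
FIG2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"),
        ("A", "F"), ("F", "G"), ("G", "H"), ("H", "D"),
        ("B", "F"), ("E", "G"), ("C", "H")]
FIG3 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"),
        ("A", "F"), ("B", "F"), ("C", "F"), ("D", "F"), ("E", "F")]


def odd_junctions(sections):
    """Points at which an odd number of sections come together."""
    order = Counter()
    for a, b in sections:
        order[a] += 1
        order[b] += 1
    return sorted(point for point, n in order.items() if n % 2 == 1)


def least_number_of_routes(sections):
    """Lower bound on the number of routes: half the number of odd junctions."""
    return len(odd_junctions(sections)) // 2


for name, net in (("Fig. 1", FIG1), ("Fig. 2", FIG2), ("Fig. 3", FIG3)):
    print(name, odd_junctions(net), least_number_of_routes(net))
```

The three bounds obtained are 2, 4, and 3, the values found above. A free end such as A in Fig. 1 counts as a junction of order one.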

No matter how complicated a network is, it is certainly possible 
to find some system of routes having no more than one route on any 
section. All one needs to do, in fact is to establish separate shuttle 
lines on each section between each pair of junctions. However, 
this will certainly require far too many routes. There will be other 
systems using fewer routes, and we are looking for the best system, 
the minimum number of routes needed. Among all the conceivable 
systems there will be certain ones that are best, such that no
other system requires fewer routes. 

It is clear that a system of routes is certainly not best if it includes 
extra ends like the transfer point R of Fig. 1. Neither is it best if 
more than one route ends at a point like F in Fig. 3, for if one 
route came through C to F and another came through D to F, then
both could be connected at F to form a single route, and this would 
decrease the total number. In order for a system to be best, the sec- 
tions at each junction must wherever possible be paired into routes. 
This means that just one route will end at a junction of odd order, 
none at a junction of even order, and the total number of ends will 
be the number of junctions of odd order in the network. 

We must still decide whether a system which is best can contain 
a closed route. Such a closed route was mentioned in our discussion 
of Fig. 2, but we connected it at A to the route AFGHD and decreased 
the number of routes from five to four. The resulting route 
ABCDEAFGHD is not closed; it has the two ends A and D. This
same reduction can be made whenever a closed route contains a 
junction of odd order, and a similar reduction can be made if the 
closed route contains only junctions of even order. Let A be such 
a junction (Fig. 4) on a closed route (shown here in the form of a 


Fig. 4 


figure eight). Some other routes through A are shown as dotted 
curves and they may continue on in any way. If the system is 
best no route can end at A, so the line from B through A continues 
on, let us say, through E. We can then combine this line and
the closed route into a single route running through B to A, around 
the closed route back to A, and then on through E. Since this reduces 
the number of routes, the original system could not have been best. 
However, we should note that in this reduction the newly formed 
route may again be closed, since the original line through BAE may 
have continued on and back to B. If this is the case, further reduc- 
tions may be possible. If at some stage the new closed route contains 
a junction of odd order, then the next reduction will yield a route
that is not closed. Otherwise all the junctions of the original net- 
work must have been of even order and the system will finally be 
reduced to a single closed route. 

To sum up our results, we have seen that if a system is best, routes
end only at junctions of odd order and only one route ends at each 
such point. Also, there will be a closed route in the system only 
if all the junctions of the original network are of even order, and 
then the closed route will traverse the entire network. In a best 
system, the number of junctions of odd order is equal to the number 
of ends of routes, and is therefore an even number. Furthermore, 
the minimum number of routes is half the number of junctions of 
odd order, except in the case where all junctions are of even order, 
when the minimum is one (closed) route. 
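
A best system can also be constructed by machine. The Python sketch below does not follow the reduction argument given above. It swaps in a different, well-known device: the junctions of odd order are temporarily joined in pairs by dummy sections, a single closed tour through all sections is found (Hierholzer's algorithm), and the tour is then cut at the dummy sections. It assumes a connected network; the edge lists are reconstructions from the routes named in the text, and all names are illustrative.

```python
from collections import defaultdict

FIG1 = [("A", "K"), ("B", "K"), ("C", "K"), ("D", "K")]
FIG2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"),
        ("A", "F"), ("F", "G"), ("G", "H"), ("H", "D"),
        ("B", "F"), ("E", "G"), ("C", "H")]
FIG3 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"),
        ("A", "F"), ("B", "F"), ("C", "F"), ("D", "F"), ("E", "F")]


def minimal_routes(sections):
    """Split a connected network into as few routes as possible."""
    edges = [(a, b, True) for a, b in sections]        # True marks a real section
    order = defaultdict(int)
    for a, b, _ in edges:
        order[a] += 1
        order[b] += 1
    odd = sorted(v for v in order if order[v] % 2 == 1)
    for i in range(0, len(odd), 2):                    # dummy sections pair the odd junctions
        edges.append((odd[i], odd[i + 1], False))

    incident = defaultdict(list)                       # junction -> ids of its sections
    for idx, (a, b, _) in enumerate(edges):
        incident[a].append(idx)
        incident[b].append(idx)

    # Hierholzer's algorithm: one closed tour using every section exactly once.
    used = [False] * len(edges)
    stack = [(edges[0][0], None)]                      # (junction, section used to reach it)
    popped = []
    while stack:
        v, _ = stack[-1]
        while incident[v] and used[incident[v][-1]]:
            incident[v].pop()
        if incident[v]:
            idx = incident[v].pop()
            used[idx] = True
            a, b, _ = edges[idx]
            stack.append((b if a == v else a, idx))
        else:
            popped.append(stack.pop())
    if not all(used):
        raise ValueError("the network is not connected")
    # popped[i] was reached from popped[i + 1] by the section popped[i][1].
    steps = [(popped[i][0], popped[i][1], popped[i + 1][0])
             for i in range(len(popped) - 1)]

    # Rotate the tour so that it ends with a dummy section, then cut at each dummy.
    dummies = [i for i, (_, idx, _) in enumerate(steps) if not edges[idx][2]]
    if dummies:
        cut = dummies[0] + 1
        steps = steps[cut:] + steps[:cut]
    routes, current = [], [steps[0][0]]
    for _, idx, w in steps:
        if edges[idx][2]:
            current.append(w)
        else:
            routes.append(current)
            current = [w]
    if len(current) > 1:                               # only when there were no odd junctions
        routes.append(current)
    return routes


for name, net in (("Fig. 1", FIG1), ("Fig. 2", FIG2), ("Fig. 3", FIG3)):
    print(name, ["".join(route) for route in minimal_routes(net)])
```

The particular routes printed depend on the order in which the sections are listed, but their number is always half the number of junctions of odd order, or one closed route when there are none.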


3. Some Maximum Problems


1. Let us compare the areas of various rectangles of two-inch 
perimeter. Some are shown in Fig. 5. The width of each rectangle 
must be less than 1 inch, and the closer it is to 1 inch the smaller 
is the height and also the area. If the height is close to 1 inch, 
then the width and again the area are very small. The inter-
mediate rectangles have larger area, and one might ask which of
the rectangles has the largest area. This is a maximum problem. 
It is probably the simplest and oldest of all such problems, and so 
perhaps the most suitable one to use as an introduction. 


Fig. 5   Fig. 6


This problem is solved in Euclid, Book VI, theorem 27. Our 
proof will use the same principles and will differ from Euclid’s only 
in its presentation. The rectangle ABCD of Fig. 6 is supposed to 
represent any rectangle with a fixed perimeter P. The square
BEFG has each side of length ¼P, so its perimeter is also P. We
assert that the square is the answer to our problem, that its area is 
greater than the area of any rectangle (not a square) ABCD having 
the same perimeter. In the figure the shaded rectangle Z is part 
of both the original rectangle and the square. The square also 
contains the area X, while the original rectangle is made up of Z
and Y. Now AB + BC is half the perimeter of the rectangle, and 
GB + BE is half the perimeter of the square, so these two are equal; 
AB + BC = GB + BE. This can be written as AG + GB + BC =
GB + BC + CE, from which we have AG = CE. Thus the rec-
tangle X is just as high as Y is wide. However, the width of X is
one side of the square, while the height of Y is part of the side of the 
square, and is therefore smaller. If two rectangles (Fig. 7) are the 


Fig. 7 


same length in one dimension, then the one with the larger other 
dimension has the larger area, i.e., X is larger than Y. Therefore 
X + Z is larger than Y + Z, and the square has larger area than the
rectangle. The height of Y would no longer be a part of the side
of the square only if ABCD were a square itself, and then the square
BEFG would itself be the original rectangle ABCD. Therefore the
square is larger than all other rectangles with the same perimeter.
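
The result can be watched at work on the rectangles of two-inch perimeter with which this section began. The Python sketch below is only an illustration; the function name `area` and the choice of widths in steps of one hundredth of an inch are assumptions. Exact fractions are used so that no rounding enters the comparison.

```python
from fractions import Fraction


def area(width, perimeter=Fraction(2)):
    """Area of the rectangle with the given width and perimeter."""
    height = perimeter / 2 - width
    return width * height


# Widths 1/100, 2/100, ..., 99/100 of an inch; the perimeter is 2 inches.
widths = [Fraction(k, 100) for k in range(1, 100)]
best = max(widths, key=area)
print(best, area(best))

# Each rectangle is compared with the square of the same perimeter.
square_area = area(Fraction(1, 2))
print(all(area(w) < square_area for w in widths if w != Fraction(1, 2)))
```

Among the widths tried, the largest area belongs to the square of side 1/2 inch, whose area is 1/4 square inch.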

This result has been stated as the Greeks would have put it. It 
can also be written as an algebraic formula, which is the way we would
put it in modern mathematics. Let x and y be the dimensions of
the rectangle in inches. The area of the rectangle may then be
given in square inches and its perimeter is x + y + x + y = 2(x + y)
inches. Therefore the square has sides ½(x + y) inches in length
and an area of $\left(\frac{x+y}{2}\right)^2$ square inches. Thus, if x and y are any
two positive numbers, the result becomes¹

$$xy \le \left(\frac{x+y}{2}\right)^2$$

or

$$\sqrt{xy} \le \frac{x+y}{2}.$$

¹ The symbol < means, and is read, “is less than”; ≤ means “is less than or
equal to.” Thus, 3 < 5, 3 ≤ 5, 3 ≤ 3. Likewise, the symbol > means “greater
than,” and ≥ means “greater than or equal to.”

Verbally, this might be stated: the geometric mean of two numbers is 
always less than their arithmetic mean. The two are equal only if x 
and y are equal. 
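
The comparison of the rectangle with the square of equal perimeter can be tried on a few numbers. The following is only a sketch in Python; the function name `square_vs_rectangle` and the sample dimensions are illustrative assumptions, not part of the original argument.

```python
import math

def square_vs_rectangle(x, y):
    """Return (area of the x-by-y rectangle, area of the square with the same perimeter)."""
    side = (x + y) / 2            # a quarter of the perimeter 2(x + y)
    return x * y, side * side

for x, y in [(1, 9), (2, 8), (4.5, 5.5), (5, 5)]:
    rect, square = square_vs_rectangle(x, y)
    # The gap square - rect equals ((x - y) / 2) ** 2, which is never negative
    # and is zero only when x equals y.
    print(x, y, rect, square, math.sqrt(x * y) <= (x + y) / 2)
```

A handful of numerical cases proves nothing, of course; the geometric argument above is what settles the matter for every rectangle.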

2. We have now seen what is meant by a maximum problem 
and its solution. First a solution is proposed, after which it is proved 
that the figure named in this solution exceeds all the other figures 
with which it is to be compared with regard to a given property 
(in this case, area). It is now possible to turn to the main problem 
of this section: to find the triangle of largest area that is inscribed in a 
given circle. It is likely that this problem was discussed, if not solved, 
at the time of Plato, a century before Euclid. However, neither 
Euclid nor more modern books give the following solution, which 
could easily have been understood and discovered by the Greeks. 

Besides the original triangle ABC, we consider the equilateral
triangle A₀B₀C₀ inscribed in the same circle or another equal circle
(Fig. 8). The area of A₀B₀C₀ is completely determined, since the


Fig. 8


equilateral triangle is definitely fixed except that it may be turned. 
We claim that the equilateral triangle is the solution to our problem, 
that its area is greater than the area of any other inscribed triangle. 

We first note that the circumference of our circle is split into three 
equal arcs by the equilateral triangle, while the other triangle breaks 
it up into three unequal arcs. Of these last three arcs, one must 
be larger than one-third of the total circumference of the circle. 
Otherwise the three arcs would each have to be exactly one-third 
the circumference in order to make up the whole. The triangle 
ABC would then be equilateral, but we have assumed that this is 
not the case. In the same way, one of the arcs is smaller than one- 
third the circumference. The third arc may be larger or smaller 
than one-third the circumference; we are not able to decide which, 
but it will not affect our argument. 

Let us suppose that the triangle has been labeled in such a way
that the arc AB is less than one-third the circumference, and arc BC
is more. As in Fig. 9, we measure an arc CB″ with a length equal
to that of AB along CB. The triangle CAB″ is the mirror image of
ACB reflected in the diameter of the circle perpendicular to AC.
We also measure the arc AB′, which is equal to one-third the circumference,
in the same direction from A as B. Since AB is less
than one-third the circumference, B′ will be past B. However,
B′ will not be past B″, that is, it will fall between B and B″. If
it were past B″, the arc AB′ would be larger than AB″, which is
the reflection of CB. But CB was greater than one-third the circumference,
while we made AB′ exactly one-third the circumference.

Now since B′ lies between B and B″, it is higher than B. The
triangles ACB and ACB’ have the same base AC, and the altitude 


Fig. 9 Fig. 10 


of ACB is less than the altitude of ACB’. Since the area of a triangle 
is one-half the altitude times the base, ACB has less area than ACB’. 
We have found a new inscribed triangle ACB′ of greater area than
the original triangle ABC and with one side AB’ equal to the side of 
an inscribed equilateral triangle. 

It may happen that ACB’ is an equilateral triangle. This will 
be the case if the arc AC cut off by the original triangle was exactly
one-third the circumference. Then the proof that the equilateral 
triangle is larger than the original triangle would be complete. 
If ACB’ is not equilateral we consider AB’ as the base of the figure 
just as AC was before. To do this we turn the figure around as in 
Fig. 10 until AB’ is at the bottom. We can now treat AB’C with 
base AB’ exactly as we did ACB with base AC. We will end up with 
the triangle AB’C’, which is equilateral since both B’C’ and AB’ 
are each one-third the circumference of the circle. Because A₀B₀C₀
of Fig. 8 and AB′C′ are equilateral triangles inscribed in equal
circles, their areas are equal. We have shown that the area of
AB′C′ is larger than that of AB′C, which in turn is larger than the
original triangle ABC, always assuming that ABC is not equilateral. 




This completes the proof; the equilateral triangle has a larger area 
than any other triangle inscribed in the same circle. 
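
The two-step improvement can be followed numerically. The sketch below assumes a circle of radius 1 and describes each inscribed triangle by its three arcs in radians, using the standard formula ½(sin a + sin b + sin c) for its area; the names `area`, `improve` and `THIRD` and the starting arcs are hypothetical choices for illustration. The step taken here always pairs the smallest arc with the largest, which is one admissible choice of the arcs AB and BC in the text.

```python
import math

THIRD = 2 * math.pi / 3   # one-third of the circumference, as an arc in radians

def area(arcs):
    """Area of a triangle inscribed in the unit circle, given its three arcs."""
    return 0.5 * sum(math.sin(a) for a in arcs)

def improve(arcs):
    """Move one vertex so that the smallest arc becomes exactly one-third.

    The largest arc gives up what the smallest arc gains; the remaining
    arc (the one over the base AC in the text) is left alone.
    """
    small, middle, large = sorted(arcs)
    return [THIRD, middle, large - (THIRD - small)]

start = [1.0, 2.0, 2 * math.pi - 3.0]   # an arbitrary non-equilateral triangle
first = improve(start)                  # one side now belongs to an equilateral triangle
second = improve(first)                 # all three arcs are now one-third
print(area(start), area(first), area(second))
```

The printed areas should increase from left to right, the last being the area of the equilateral triangle.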

3. The previous result is a particular case of a more general 
statement that can be proved in a similar way. We shall show 
that of all polygons of n sides inscribed in a given circle, the regular polygon 
is largest. If n = 3 the polygon is a triangle and this is our previous
result. 

In order to prove this we need to notice just one more simple fact. 
If we are given any polygon inscribed in a circle (Fig. 11), we can 
inscribe another polygon in the circle having the same sides but in 
any other order. All we need to do is draw the radii to the corners 
of the polygon and cut the circle into sectors along these radii. 




Fig. 11 


These sectors can then be rearranged in any desired order. It is 
obvious that the new polygon has the same area as the original one. 

The proof now goes just about as before. In the first place, if 
the polygon is not regular, there must be a side that cuts off an arc 
less than one n-th of the circumference, and one that cuts off an arc 
larger than one n-th. In the case of a triangle any two sides are 
always next to each other, but if n is greater than three the two
sides we are interested in may not be next to each other. However, 
we have just seen that we can draw a new polygon in the same 
circle, with the same sides and area, but with the two particular 
sides next to each other. Suppose the smaller side is called AB, 
the larger BC. We can measure off one n-th of the circumference 
from A in the direction of B and call its end B’. As before, B’ lies 
between B and the mirror image B″ of B. Then the new polygon
with B’ in place of B and all the other corners the same has a larger 
area than the original polygon. Also, one side, AB’, of the new 
polygon is the length of a side of a regular polygon. We can apply 
this same procedure to the n — 1 remaining sides, always rearranging 
the order of the sides if necessary. Continuing a step at a time, we 
will finally arrive at a regular polygon, since a new side becomes
the correct length at each step. Because the area is increased each
time, the regular polygon has a larger area than the original polygon.

In a similar way it can be shown that of all polygons of n sides
circumscribed about a circle, the regular polygon has least area.


4. Incommensurable Segments and Irrational Numbers 


The measurement of length, area, and volume is at the root of all 
geometry. To measure one line segment by another, we see how 
many times one goes into the other. This is simple enough if it 
goes in exactly without leaving a remainder. If the smaller segment
does not go into the larger one exactly, then we look at the remainder.
It may happen that the remainder is one-half, one-third, two-thirds,
or some other similar fractional part of the segment we are using as
a measure. If so, we have a sort of substitute measure, a fractional
part of the segment that goes exactly into both the segment being
measured and the one that is used as a measure. This new segment
is a “common measure” for the two original segments.

The earliest geometrical problems undoubtedly involved a com- 
mon measure. For example, if a rectangle has sides 3 and 4 inches 
in length, then by the Pythagorean theorem the square constructed 
on the diagonal has area 

3² + 4² = 9 + 16 = 25 sq. in.
and the diagonal is consequently 5 inches long. Since the smaller 
side of the rectangle and the diagonal have a 1 inch segment as a 
common measure, they are in the proportion 3: 5. 
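
For segments whose lengths are known exactly as fractions, the search for a common measure by repeatedly looking at the remainder can be sketched as follows. This is an illustration only: the function name `common_measure` is an assumption, and lengths are modelled with Python's `fractions.Fraction`.

```python
from fractions import Fraction

def common_measure(a, b):
    """Largest segment that goes exactly into both a and b (exact fractions)."""
    while b != 0:
        # Lay b off along a as often as it will go and keep the remainder,
        # then continue with b and that remainder.
        a, b = b, a % b
    return a

print(common_measure(Fraction(3), Fraction(5)))        # side and diagonal of the 3-by-4 rectangle
print(common_measure(Fraction(3, 2), Fraction(5, 4)))  # two fractional lengths
```

The loop is certain to stop only because the inputs are fractions. Whether the same procedure stops for the side and diagonal of a square is exactly the question taken up next.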

It is quite natural to try to do the same thing for a square and to 
look again for a common measure for the side and diagonal. If 
we attempt to do this, we find a remainder (Fig. 12) and are forced 
to try smaller and smaller fractional parts of the segment.


Fig. 12


The question arises as to whether one can eventually find a fine enough
measure or whether there just isn’t any such measure, that is, whether
the two segments are “commensurable” or “incommensurable”. This
leads to the question whether a line segment can be divided into
arbitrarily fine divisions or whether there is a limit to the possible
divisions. Is the line made up of a very large number of small
indivisible parts? Does it have an “atomic structure”? The conception
of the atomic structure of matter is attributed to Democritus,
who lived just before Plato. However, there is a difference between 
this and the atomic structure of the line. One can easily consider 
a line as “continuous,” i.e. capable of arbitrarily fine division, and 
still suppose that physical material is strung along it in a series of 
atoms. A fragment attributed to Anaxagoras has also been preserved 
from about the same time; this asserts, in effect, that the line is 
continuous. The fact that only this fragment has been handed 
down does not signify that it was just an accidental utterance. 
Rather, it was a controversial thesis that led to much discussion, 
and it is representative of a time when mankind was making real 
steps toward the solution of this basic problem. 

We can imagine the tremendous effect of the more far-reaching 
discovery that the side and diagonal of a square are incommensurable. This 
discovery is attributed to the Pythagoreans, a secret society of 
Southern Italy about whom very little is definitely known. According
to a legend, the Pythagorean who made these investigations
public atoned by perishing in a shipwreck. Perhaps this is more 
allegory than truth, since it may refer to the shattering effect that 
the discovery of irrationals had on the foundations of contemporary 
thought. In any event we have Plato’s own report in his Laws of 
how this discovery excited him when he first learned of it. 

We shall give two proofs of this fact without considering the in- 
teresting historical question of which proof is the older. The second 
was given not only by Euclid but even by Aristotle. The first, 
apparently older, is of the type usual with the Greeks and, in spirit, 
belongs to the tenth book of Euclid. 

For the first proof we must begin by noticing certain facts of an ele- 
mentary geometric nature. The whole argument is easily recognized 
as arising from a vain attempt to find a common measure by measuring 
the side off along the diagonal and then continuing to attempt to 
find a suitable fractional part. We measure off the side along the 
diagonal from B (Fig. 13). It goes in only once and we can call 
the point where it ends D. The line B′D has been drawn perpendicular
to BD and B′ is the point where it crosses AC. Also B and
B′ have been connected. We have BA = BD, BB′ = BB′, angle
BAB′ = angle BDB′, the last because each is a right angle. Therefore
by one of the congruence theorems, the triangles BAB′ and
BDB′ are congruent and consequently AB′ and DB′, being corresponding
sides, are equal. Also angle B′CB, between a side of
the square and a diagonal, is half a right angle, and since angle
CDB′ was constructed a right angle, just half a right angle is left
over for the third angle of the triangle CDB′. Therefore CDB′ is
an isosceles right triangle and its legs DB′ and DC are equal.
Combining what we have proved, we have

(1) AB′ = B′D = DC.


Fig. 13


We now erect a perpendicular A′C to the diagonal at C and make
it equal to DB′. When A′ is connected to B′, A′B′CD forms a new
square smaller than the original one. The whole process can now
be applied to the new square: its side is measured off along its
diagonal B′C from B′ to give D′, and the line B″D′ drawn perpendicular
to B′C. As before, we have

(2) A′B″ = B″D′ = D′C.

Clearly this can be repeated again and again, and each time there
will be a remainder on the diagonal which can be used as a side
of the next square. Although the process will never end, the
remainders (which are never zero) become smaller at each step:

(3) CD > CD′ > CD″ > ⋯.


Each of these remainders is the difference between the diagonal and 
side of the corresponding square: 


(4) CD = CB − AB, CD′ = CB′ − A′B′, CD″ = CB″ − A″B″, ⋯.

This completes the geometric preliminaries for the first proof.
The proof itself will be an indirect one. We assume that the side 
and diagonal are commensurable and show that this leads to an 
impossible situation. If the two are commensurable they have a 
common measure, a segment E that goes exactly into both the side
and the diagonal. Now if any two segments are exact multiples
of E, their difference is also an exact multiple of E. Therefore, if
CB and AB are exact multiples of E, so is CD by (4). Also, by (1)
we have CB′ = CA − B′A = AB − CD, which is a difference of
multiples of E and hence an exact multiple of E itself. The second
square of Fig. 13 has side A′B′ = CD and diagonal CB′, both of
which are multiples of E. From the first square we have gone to
the second and we can go on in the same way; from A′B′ and CB′
we find that CD′, A″B″, and CB″ are multiples of E, and so on
through the further squares. 

We now come to the contradiction. If CB and AB are exact 
multiples of E, then we have seen that CD, CD′, CD″, ⋯ are also
exact multiples of E. But (3) shows that these multiples of E get
smaller and smaller without stopping, though they are never zero.
This is not possible since, for example, if CD were 1000 E, then
CD′, a smaller multiple of E, would be at most 999 E, etc., until
at the latest the 1001st term would be less than E. It would then be
zero, since it is a multiple of E less than E. This contradicts the fact
that no term of (3) is zero. 
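
The passage from one square to the next can be imitated with whole numbers. In the sketch below the side and diagonal are given as whole multiples of E, and the step follows (4) and (1): the new side is diagonal minus side, the new diagonal is side minus new side. The function names and the starting pairs (5, 7) and (12, 17) are illustrative assumptions; these pairs are not true squares, only close approximations, and that is exactly why the process breaks off for them.

```python
def next_square(side, diagonal):
    """Side and diagonal of the next smaller square, both in units of E."""
    new_side = diagonal - side        # CD = CB - AB, by (4)
    new_diagonal = side - new_side    # CB' = AB - CD, by (1)
    return new_side, new_diagonal

def descend(side, diagonal):
    """Repeat the construction until the side is no longer positive."""
    steps = [(side, diagonal)]
    while steps[-1][0] > 0:
        steps.append(next_square(*steps[-1]))
    return steps

for s, d in descend(5, 7):
    # d*d - 2*s*s only changes sign from step to step, so it is never 0:
    # none of these whole-number pairs is the side and diagonal of a square.
    print(s, d, d * d - 2 * s * s)
```

With whole numbers the side reaches zero after finitely many steps, whereas the geometric remainders of (3) never do; that clash is the contradiction used in the proof.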

The second proof is much simpler, and the arithmetical preparation 
for it is shorter than the geometric preliminaries of the first proof. 

We first consider even and odd numbers. An even number is 
twice some other, so that it can be written 2x. An odd number is 
an even number increased by 1, and can be written 2x + 1. The 
square of an odd number is always odd, for 


(2x + 1)² = 4x² + 4x + 1 = 2(2x² + 2x) + 1


is again twice a number increased by 1. From this we can im- 
mediately prove: 

Lemma² 1. If the square of a number is even then the number itself
is even. If the number were odd, as we have just seen, its square
would also be odd. It is even easier to prove:

Lemma 2. The square of an even number is always divisible by 4. Thus
(2x)² = 4x² is 4 times x², or simply 4g.

² A lemma is a proposition not important enough to be called a theorem,
but which is to be used in the later proof.

The main proof is again indirect. We again suppose that the
side and diagonal of the square have a common measure E. Let 
the diagonal be d times E, the side s times E. Then, applying 
the Pythagorean theorem to one of the right triangles formed by 
two sides of the square and the diagonal, we have 


(5) d² = s² + s², d² = 2s².


We can suppose that d and s have no common factors since they 
can always be reduced by a new choice of E. For example, 10 and 
16 have the common factor 2, but they can be reduced to 5 and 8 
by doubling the size of the common measure E. From now on we 
shall assume that such a reduction has been made. 

From (5) we see that d² is twice a number and hence is even.
Lemma 1 then shows d is also even. Consequently s must be odd,
since otherwise d and s would have the common factor 2 in contradiction
to the fact that they are reduced. However, since d is even,
lemma 2 shows that d² is divisible by 4, d² = 4g, and 2s² = 4g or
s² = 2g. Therefore s² is even and so is s (by another application
of lemma 1). This contradicts the fact that we have just shown 
(that s is odd), and hence shows that the original assumption, that 
s and d have a common measure, is false. 
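
As a purely empirical companion to this proof, one can search for whole numbers d and s satisfying (5) and check the two lemmas on a range of numbers. The sketch below is an illustration under stated assumptions: the function name `solutions_of_5` and the search limits are arbitrary, and a finite search can of course never replace the argument above.

```python
from math import isqrt

def solutions_of_5(limit):
    """All pairs (d, s) with 1 <= s <= limit and d*d == 2*s*s."""
    hits = []
    for s in range(1, limit + 1):
        d = isqrt(2 * s * s)          # the only whole-number candidate for d
        if d * d == 2 * s * s:
            hits.append((d, s))
    return hits

# Lemma 1: whenever n*n is even, n itself is even.
lemma_1 = all(n % 2 == 0 for n in range(1, 1000) if (n * n) % 2 == 0)
# Lemma 2: the square of an even number 2x is divisible by 4.
lemma_2 = all((2 * x) ** 2 % 4 == 0 for x in range(1, 1000))

print(solutions_of_5(10_000), lemma_1, lemma_2)
```

The search is expected to come back empty, in agreement with the theorem.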

The essence of both proofs is that a decreasing sequence of whole 
positive numbers must finally come to an end. In the first proof 
it is apparent in our discussion of the sequence (3). In the second 
proof it is hidden but really included in the remarks concerning the 
reduction of the numbers d and s. The proof of the fact that the 
reduced form of two numbers can be found depends on a series of 
decreasing steps. 

In the current mathematical notation, formula (5) would be
written

$$\left(\frac{d}{s}\right)^2 = 2.$$

Likewise, our final result would be written: There is no fraction
(no rational number) x = d/s whose square is 2. This can also be
stated as: There is no rational number which is equal to √2, or
finally, √2 is an “irrational number”.


5. A Minimum Property of the Pedal Triangle


We shall again consider a problem of the kind discussed in Chapter
3, but this time it might more properly be called a minimum 
problem. It will serve to introduce mathematical methods that are 
highly refined yet clear and simple. The theorem and the proof 
we shall give here are the work of H. A. Schwarz. Although the 
theorem is only a relatively minor mathematical problem, it shows
how this great mathematician’s genius manifests itself, equally in 
relatively trivial and extremely significant work. 

1. Before considering our main theorem, let us look at a very 
simple problem concerned with the law of reflection of light. It 
is well known that if a light ray starts at A (Fig. 14) and strikes a
mirror g, it is reflected toward B in such a way that the angle of 



Fig. 14 Fig. 15 


incidence and the angle of reflection are equal. What we want 
to prove is that the path ADB that the light ray chooses is the shortest 
of all possible paths from A to B that touch the mirror g. It is the
path that a steamboat should take if it were required to go from a 
place A to another place B and to touch at the bank g on the way. 
We shall not go into the question of why a light ray, which does 
not have the power of reasoning, chooses the same path that would 
be chosen by the pilot of the steamboat after considerable thought. 
All we shall prove is the purely mathematical fact that the path ADB 
with equal angles of incidence and reflection is shorter than any other path ACB. 

The proof depends on a device that seems artificial from a mathe- 
matical standpoint, but which is quite natural from the point of 
view of optics. We reflect the point A and the lines AC and AD 
in the mirror g (Fig. 15). If A’ is the image of A, then A’C is the 
image of AC and A’D is the image of AD, so that A’C = AC, 
A’D = AD, and A’E = AE. Therefore the triangles EDA and 
EDA’ are congruent and the angles EDA and EDA’ are equal. 
According to our hypothesis we have angle EDA = angle CDB.
Therefore angles CDB and EDA′ play the roles of vertical angles;
that is, A’DB is a straight line. 

Now the lengths of the paths ADB and A’DB, as well as ACB 
and A’CB, are equal. Since A’DB is a straight line connecting 
A’ and B, it is shorter than the path A’CB, and consequently ADB 
is shorter than ACB. Here we have used the fact that a straight 
line is the shortest distance between two points. 
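
The reflection device translates directly into coordinates. In the sketch below the mirror g is taken to be the x-axis and A and B are arbitrary points above it; these coordinates, and the names `path_length` and `reflection_point`, are assumptions made only for illustration.

```python
import math

def path_length(a, b, x):
    """Length of the broken path from a to the mirror point (x, 0) and on to b."""
    touch = (x, 0.0)
    return math.dist(a, touch) + math.dist(touch, b)

def reflection_point(a, b):
    """x-coordinate of D, where the straight line from the image A' to B crosses g."""
    (ax, ay), (bx, by) = a, b
    # A' = (ax, -ay); the line from A' to B reaches height 0 after the fraction
    # ay / (ay + by) of the way.
    return ax + (bx - ax) * ay / (ay + by)

A, B = (0.0, 3.0), (8.0, 1.0)
d = reflection_point(A, B)
print(d, path_length(A, B, d))
for x in (0.0, 2.0, 5.9, 6.1, 8.0, 10.0):
    print(x, path_length(A, B, x))    # every other touching point gives a longer path
```

For these sample points D falls at x = 6, where both rays rise with the same slope ½, so the angles of incidence and reflection agree.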

2. We now turn to our main problem, to inscribe in a given acute-angle
triangle ABC a triangle UVW whose perimeter is as small as possible
(Fig. 16). The assertion is that the “pedal triangle” EFG (Fig. 17),
whose vertices are the three feet of the altitudes of the triangle ABC, has a
smaller perimeter than any other inscribed triangle UVW.


Fig. 16 Fig. 17


We must first prove a lemma concerning the pedal triangle. We 
assert the angles AFG and CFE are equal (as in the law of reflection), 
and consequently that the analogous angles at E are equal and also
those at G. In order to prove this lemma we must recollect some theo- 
rems of plane geometry: the theorem of Thales, that an angle inscribed 
in a semicircle is a right angle (Fig. 18); that inscribed angles which 
intercept the same arc are equal (Fig. 19); that the altitudes of a 
triangle meet at a point. Using these, and calling H the point where
the altitudes meet, we see that the circle with
diameter AH passes through G and F and that the circle with
diameter CH passes through E and F (Fig. 20). Furthermore,
angle AFG intercepts the arc AG, as does angle AHG, so these two
angles are equal. In the same way we see that the angles CFE
and CHE are also equal. But angles AHG and CHE are vertical
angles and hence are equal. Therefore we have angle AFG =
angle CFE.


Fig. 18 Fig. 19 Fig. 20


3. We can now commence Schwarz’s proof. We reflect the
triangle ABC in the side BC (Fig. 21), reflect the reflected triangle
in its side CA′, reflect the resulting triangle in A′B′, reflect in B′C′,
then C′A″, then A″B″, a total of six reflections. We first prove
the fairly obvious fact that the final position A″B″C″ is the original
position ABC moved parallel to itself without turning. The first
two reflections shift ABC to the third position A′B′C. This shift
could have been made without reflections and without lifting the
triangle out of its plane merely by rotating it about C through
the angle 2C in a clockwise direction. Similarly, the shift from
the third to the fifth position could be accomplished by a clockwise
rotation through the angle 2B about the point B′. Finally, a clockwise
rotation through the angle 2A about A″ would yield the seventh
or final position. In all, the triangle has been rotated one complete
revolution, through the angle 2C + 2B + 2A, since the sum
A + B + C of the angles of a triangle is a straight angle. The final
position of the triangle therefore has the same orientation as the
original; it has merely been moved but remains parallel to itself.
Therefore BC is parallel to B″C″.

Now we want to trace the various positions assumed by the pedal
triangle and the triangle UVW under the successive reflections.
These are shown in Fig. 21 by dotted lines and shading. From our
lemma concerning the pedal triangle, we see at once that the second
position of EG forms a straight line with the first position of FE.
In the same way, one side of the pedal triangle will always lie on
the continuation of this line in successive positions. Therefore the
straight line EE″ is made up of 6 segments, 2 equal to FG, 2 to GE,
and 2 to EF; hence it is equal to twice the perimeter of the pedal triangle.

Tracing out the positions assumed by the arbitrary triangle UVW 
in the same way, we find that the zig-zag line UV′W′U′V″W″U″,
connecting U and U″, is equal to twice the perimeter of the triangle UVW.

We have already seen that the segments UE and U″E″ are
parallel since they lie on BC and B″C″. They are also equal, since
they are corresponding segments in two positions of the triangle
ABC. Then by a theorem of plane geometry EE″U″U is a parallelogram,
and consequently its other two sides are equal, UU″ = EE″.
Therefore UU″ is also equal to twice the perimeter of the pedal
triangle. The straight line UU″ connecting U and U″ is shorter
than the zig-zag line connecting the same two points, and the zig-zag 
line is twice the perimeter of UVW. Therefore the perimeter of 
the pedal triangle is less than the perimeter of UVW, as was to be 
proved. 
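
The theorem can be spot-checked in coordinates. The sketch below picks one acute-angle triangle, computes the feet of its altitudes, and compares the perimeter of the pedal triangle with that of many randomly chosen inscribed triangles. The triangle, the helper names `foot`, `point_between` and `perimeter`, and the random seed are all assumptions made for illustration.

```python
import math
import random

def foot(p, q, r):
    """Foot of the perpendicular from p to the line through q and r."""
    dx, dy = r[0] - q[0], r[1] - q[1]
    t = ((p[0] - q[0]) * dx + (p[1] - q[1]) * dy) / (dx * dx + dy * dy)
    return (q[0] + t * dx, q[1] + t * dy)

def point_between(q, r, t):
    """Point dividing the segment from q to r in the ratio t : (1 - t)."""
    return (q[0] + t * (r[0] - q[0]), q[1] + t * (r[1] - q[1]))

def perimeter(u, v, w):
    return math.dist(u, v) + math.dist(v, w) + math.dist(w, u)

A, B, C = (1.0, 3.0), (0.0, 0.0), (4.0, 0.0)            # an acute-angle triangle
E, F, G = foot(A, B, C), foot(B, C, A), foot(C, A, B)   # feet of the altitudes
pedal = perimeter(E, F, G)

rng = random.Random(0)
trials = [
    perimeter(point_between(B, C, rng.random()),
              point_between(C, A, rng.random()),
              point_between(A, B, rng.random()))
    for _ in range(10_000)
]
print(pedal, min(trials))   # no random inscribed triangle should beat the pedal triangle
```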

This proof is typical of many truly mathematical proofs. It 
consists essentially in transforming the hypotheses and conclusion 
until the true kernel of the theorem can be recognized at a glance. 


6. A Second Proof of the Same Minimum Property 


1. In the last chapter we proved that of all the triangles inscribed
in a given acute-angle triangle, the pedal triangle has the smallest 
perimeter. It is worthwhile to consider another proof of the same 
theorem, because this second proof will illustrate some new ideas 
and, for our purposes, the methods used are of more importance and 
interest than the mere mathematical content of new theorems. 
The previous proof, originally given by H. A. Schwarz, depended 
essentially on the fact that a straight line is the shortest distance 
between two points, and it made use of the idea of reflection of a 
figure in a line. These two principles also form the basis of the 
second proof, and it is of interest to contrast the manner in which 
they are used in the two proofs. The following proof was given by 
L. Fejér, who discovered it as a student and thereby won especial 
recognition from H. A. Schwarz. 

2. In the given acute-angle triangle ABC (Fig. 22), let UVW be 
an arbitrary inscribed triangle with U on BC, V on CA, W on AB. 


Fig. 22 


Let U be reflected in the two lines AC and AB and call U′ and U″
the two images. Now UV and U′V are mirror images of each other
and hence are equal. For the same reason UW and U″W are also
equal. The perimeter of the triangle UVW is UV + VW + WU
and therefore it is equal to the length of the path U′VWU″.

If we hold U fixed but move V and W to new positions, then the
points U′ and U″ remain fixed because they depend only on U and
the triangle ABC. The path U′VWU″ then connects the two fixed
points U′ and U″, and its length is always equal to the perimeter
of UVW. The shortest path from U′ to U″ is a straight line. Therefore
the straight line segment U′U″ is the smallest possible perimeter
for an inscribed triangle with one vertex held at U. This minimal
triangle with vertex at U, which we shall call UMN, is shown in 
Fig. 22. 

3. Having found the triangle of smallest perimeter with vertex 
at U, we need only compare the minimal triangles for various 
positions of U and pick out the one with smallest perimeter. That 
triangle will certainly have the smallest perimeter of all inscribed 
triangles. 

We must determine the position of U so that the segment U′U″
is as small as possible. For this purpose we first notice that the
triangle AU′U″ is isosceles, with AU′ and AU″ as equal sides.
Indeed these two segments are each mirror images of the same
segment AU and hence are equal, AU = AU′ = AU″.

Although the legs of the triangle AU′U″ are equal in length to
AU and hence depend on the position of U on BC, the size of the
angle U″AU′ does not depend on the position of U. This angle is completely
determined by the original triangle ABC and nothing else,
since (because of the reflections) we have the following equations 
between the angles of the figure: 


UAB = U″AB, UAC = U′AC.

From the first we have

U″AU = 2UAB,

and from the second,

U′AU = 2UAC;

and hence

U′AU + U″AU = 2UAB + 2UAC

or

U′AU″ = 2BAC,

which proves our assertion concerning the angle U′AU″.

4. In the isosceles triangle AU′U″ we want to make the base
U′U″ as small as possible. Since the angle at A does not depend
on U, all these triangles, for different positions of U, have the same
vertex angle. Of all these the one with the shortest base will also
have the shortest legs. The legs AU′ and AU″ have the length AU.
Therefore we will obtain the shortest segment U′U″ if we choose U
so that AU is as short as possible.

Now the segment AU connects the point A with the line BC, 
and it is well known that the shortest distance from a point to a 
line is the perpendicular distance. Therefore we must choose U 
so that AU is perpendicular to BC, that is, so that AU is the altitude 
of the triangle ABC through A. 
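
The behaviour of the base U′U″ as U moves along BC can be watched in coordinates. The sketch below assumes the acute-angle triangle A = (1, 3), B = (0, 0), C = (4, 0), so that BC lies on the x-axis and the foot of the altitude from A is at x = 1; the names `reflect` and `base_length` are hypothetical. Since the isosceles triangle AU′U″ has legs AU and vertex angle 2BAC, its base should equal 2 · AU · sin BAC.

```python
import math

def reflect(p, q, r):
    """Mirror image of the point p in the line through q and r."""
    dx, dy = r[0] - q[0], r[1] - q[1]
    t = ((p[0] - q[0]) * dx + (p[1] - q[1]) * dy) / (dx * dx + dy * dy)
    fx, fy = q[0] + t * dx, q[1] + t * dy      # foot of the perpendicular from p
    return (2 * fx - p[0], 2 * fy - p[1])

A, B, C = (1.0, 3.0), (0.0, 0.0), (4.0, 0.0)

def base_length(x):
    """Length of U'U'' when U = (x, 0) lies on BC."""
    U = (x, 0.0)
    return math.dist(reflect(U, A, C), reflect(U, A, B))

# sin of angle BAC from (twice the area) / (AB * AC); twice the area is 4 * 3 = 12.
sin_A = 12 / (math.dist(A, B) * math.dist(A, C))

for x in (0.5, 1.0, 2.0, 3.5):
    print(x, base_length(x), 2 * math.dist(A, (x, 0.0)) * sin_A)
```

The two printed columns should agree, and the smallest value should occur at x = 1, where AU is perpendicular to BC.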

5. Let us now construct this triangle EFG of smallest perimeter 
(Fig. 23). Let E be the foot of the perpendicular to BC through A. 
If E’ and E” are the images of E under reflection in AC and AB, 
then E’E” is the length of the smallest perimeter of an inscribed 
triangle. The two points F and G where the line E′E″ cuts AC
and AB are the other two vertices of the minimal triangle. 



Fig. 23 

If we think back over what we have done, we see that every 
inscribed triangle UVW, different from EFG, must have a larger 
perimeter. For if U is different from E, then the segment U′U″
is larger than E′E″ and the perimeter of UVW is at least as large
as U′U″. If U and E coincide, then one or both of the points V and
W will differ from the points F and G, and the path E′VWE″ will
differ from the straight line E′FGE″. In both cases, then, the
perimeter of UVW will actually be greater than the perimeter 
of EFG. 

6. These considerations have shown that the problem of finding 
an inscribed triangle of least possible perimeter has only one solution. 
We shall make use of this uniqueness of the solution. Our construc- 
tion of the minimal triangle did not treat all three vertices in the 
same manner. One vertex E is the foot of the altitude through A, 
but the other two were found by a construction having nothing to 
do with the altitudes through B and C. 

We could have carried out our argument starting with the vertex 
B in place of A. That is, in § 2, instead of reflecting the point U 
in the sides AB and AC, we could have reflected the point V in BA
and BC and continued accordingly. We would then have ended
with a minimal triangle whose vertex F was the foot of the altitude 
through B. Since there is only one minimal triangle, this construc- 
tion starting with B must have led to exactly the same triangle EFG 
as the original construction starting with A. Since we could also 
start with the vertex C, we may conclude that in the minimal 
triangle EFG not only E but F and G as well are feet of altitudes. 
Thus we have proved the theorem. 

7. We can still get a little more from the proof. Making use 
of the uniqueness of the solution, we saw that if E had a certain 
property (being a foot of an altitude), then F and G also had the 
analogous property. Similarly, any property that F and G have by 
our construction will also hold for E. Because E’ is the mirror 
image of E, the angles EFC and E’FC are equal. Since E’FC and 
GFA are vertical angles, they are also equal, and angle EFC = 
angle GFA. That is, the two sides of the minimal triangle that go 
through F form equal angles with the side AC of the original triangle.
The corresponding statement holds for the point G. If we had started 
our construction with F as the foot of the altitude through B, then this 
same proof would show that the angles GEB and FEC are also equal. 

Disregarding the minimal property of the triangle EFG, we know 
from §6 that EFG can be characterized as the pedal triangle. 
Combining these two results, we have the theorem: if a pedal triangle 
is inscribed in an acute-angle triangle, then the two angles formed
at each vertex of the pedal triangle by the two sides of the pedal 
triangle and the side of the original triangle are equal. 

This theorem now contains nothing concerning a minimum. It 
is the type of theorem customarily found in elementary geometry, 
and it could be proved by the methods of elementary geometry. 
In fact, we have actually done just that in Chapter 5. Schwarz’s 
proof needed this result as a lemma, and we supplied it by making 
use of a number of theorems concerning circles. An advantage of 
Fejér’s proof is that it makes use of nothing other than the principle 
of the shortest distance and reflections. Furthermore, Fejér’s proof 
is distinguished by the fact that it uses only two reflections, while 
Schwarz’s employs six. 

8. There is a counterpart to the theorem on the pedal triangle: 
In every acute-angle triangle there is one and only one point the 
sum of whose distances from the three vertices is a minimum. This 
point is so situated that the lines that join it to the three vertices 
form angles of 120° with each other. 

This theorem was proved by L. Schruttka by a method suggested 
by analogy with Schwarz’s proof of the pedal triangle theorem. A 
much shorter proof has been given by Bückner, and this is the one 
we shall use. 

Let P (Fig. 24a) be an arbitrary point in the acute-angle triangle 
ABC. Let the triangle ACP be turned about the point A through 
60° to the position AC’P’. This rotation is to be made in such a 
direction that AC turns out of the triangle, so that finally the line 
AC lies between AB and AC’. Then we have C’P’ = CP and 
PP’ = AP, for the triangle APP’ is not only isosceles but equilateral, 
its angle at A being 60°. Then the path BPP’C’ represents the sum of 
the distances of P from the three vertices A, B, C. The point C’ is 
independent of the position of P. All the paths corresponding to 
various positions of P join B and C’. The shortest of these paths 
is the straight line BC’ (Fig. 24b). Therefore the minimal point 
$P_0$ must lie on BC’, and its position is completely determined by the 
fact that angle $AP_0C'$ = 60°. The supplementary angle $AP_0B$ is 
then 120°. The construction shows that there can be only one 
minimal point $P_0$. Consequently the same construction with A 
replaced by another vertex will lead to the same point $P_0$. Therefore 
the angles $BP_0C$ and $CP_0A$ are also 120°. 
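
The rotation argument can be checked numerically. The following is only a sketch under stated assumptions: the triangle with vertices 0, 4 and 1 + 3i is an arbitrary acute-angle example, points are stored as complex numbers, and the helper names (`rotate_outward`, `intersect`, `angle_at`) are invented for this illustration. It finds the minimal point as the intersection of BC’ with the corresponding line obtained by starting from the vertex B.

```python
import cmath
import math

def cross(u, v):
    """Cross product of two plane vectors stored as complex numbers."""
    return (u.conjugate() * v).imag

def rotate_outward(pivot, moving, third):
    """Turn `moving` about `pivot` through 60 degrees, away from `third`."""
    for sign in (1, -1):
        image = pivot + (moving - pivot) * cmath.exp(sign * 1j * math.pi / 3)
        # The image must fall on the side of the line pivot-moving opposite `third`.
        if cross(moving - pivot, image - pivot) * cross(moving - pivot, third - pivot) < 0:
            return image
    raise ValueError("the three points lie on one line")

def intersect(p1, p2, q1, q2):
    """Intersection of the line through p1, p2 with the line through q1, q2."""
    d1, d2 = p2 - p1, q2 - q1
    t = cross(q1 - p1, d2) / cross(d1, d2)
    return p1 + t * d1

def total_distance(p, vertices):
    """Sum of the distances from p to the given vertices."""
    return sum(abs(p - v) for v in vertices)

def angle_at(p, q, r):
    """Angle qpr in degrees."""
    return abs(math.degrees(cmath.phase((q - p) / (r - p))))

A, B, C = 0 + 0j, 4 + 0j, 1 + 3j      # assumed example of an acute-angle triangle
C1 = rotate_outward(A, C, B)          # the point C' of the text
A1 = rotate_outward(B, A, C)          # same construction starting from the vertex B
P0 = intersect(B, C1, C, A1)          # the minimal point lies on both lines

print(abs(B - C1), total_distance(P0, (A, B, C)))
print(angle_at(P0, A, B), angle_at(P0, B, C), angle_at(P0, C, A))
```

In this example the length of BC’ and the sum of the three distances from the computed point agree, and the three angles at that point each come out as 120° up to rounding.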


7. The Theory of Sets 


The subject of this chapter lies at the very foundations of mathe- 
matics. However, our interest in it will depend more on the beauty 
and simplicity of the manner in which it is built up than on its 
significance for mathematics in general. The theory of sets, origin- 
ated by Georg Cantor, is a truly mathematical theory which starts 
with only the very simplest concepts and builds up to a ramified and 
important subject through the use of pure reasoning. 

Are there more whole numbers than even numbers? Which are 
more numerous, the points of a line segment or the points of the 
surface of a square? It is just such questions as these that started 
Cantor on his theory. It is important to avoid jumping to 
conclusions when trying to answer these questions. The questions as 
they are put are not precise because we don’t know exactly what we 
mean by one being more than another. Cantor’s first important 
step was to give them a precise meaning by using simple methods 
of counting off, as is done for finite numbers, and by carefully 
differentiating between cardinal and ordinal numbers, a distinction 
that is merely grammatical for finite numbers. 

A simple example will point out the direction in which we are 
to go. Suppose we are in a dance hall and are asked whether 
there are more men or women present. What would be the easiest 
way to decide? One method would be to separate the men and 
women into two groups, to count each group, and to compare the 
numbers. A simpler method, however, would be to start the dance. 
The men and women would pair off, and it would only be necessary 
to observe whether those left over were men or women. We suppose 
that everyone dances if he can find a partner. 

This principle of pairing off was adopted by Cantor as a starting 
point. If we wish to decide whether there are more whole numbers 
than even numbers we are to try to pair them off. In fact, we 
can find a pairing in which each whole number is paired with an 
even number and none is left out. We do it as follows: 


1, 2, 3, 4,  5,  6, ··· 
2, 4, 6, 8, 10, 12, ··· 


where each whole number in the upper row is to be paired off with 
the even number directly below it. By this method every number 
gets paired off and none is left out. This simple fact is quite 
remarkable. The two rows have been paired off exactly and yet 
the lower row consists of part and only part of the upper row. 
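
The pairing in the two rows can be written down as a rule and its inverse. This is a minimal sketch; the function names `even_partner` and `whole_partner` are chosen here for illustration only.

```python
def even_partner(n):
    """Even number written directly below the whole number n."""
    if n < 1:
        raise ValueError("n must be a whole number 1, 2, 3, ...")
    return 2 * n

def whole_partner(m):
    """Whole number written directly above the even number m."""
    if m < 2 or m % 2 != 0:
        raise ValueError("m must be an even number 2, 4, 6, ...")
    return m // 2

upper = list(range(1, 7))
lower = [even_partner(n) for n in upper]
print(upper)   # [1, 2, 3, 4, 5, 6]
print(lower)   # [2, 4, 6, 8, 10, 12]
```

Because each rule undoes the other, no number in either row is left without a partner and none receives two.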

This brings up an essential difference between this case and the 
case of finite numbers. In the case of the dance hall (where there 
is a finite number of dancers) it is quite immaterial which man 
dances with which woman. The number of those sitting out is 
always the same and it does not change as long as no one enters or 
leaves the hall. It is quite different in the case of the whole numbers 
and even numbers. We have seen a pairing off that comes out 
exactly, including every element, but it is easy to give another which 
does not do so. We can use the most natural pairing: 2 with 2, 
4 with 4, 6 with 6, etc. We have paired off the even whole numbers, 
but all of the odd ones are left over. The essence of Cantor’s theory 
is that it abandons the idea of arbitrary pairing and only demands 
that we find some one pairing off that comes out exactly. If such 
a pairing can be found, the two sets are said to be of the same power. 

Cantor’s next step was to prove that the set of all rational numbers, 
i.e. all whole numbers and fractions, does not have higher power than the set 
of whole numbers. To construct the appropriate pairing we arrange 
the fractions not in order of size but according to the size of the sum 
of their numerator and denominator. We consider only reduced 
fractions and begin with those the sum of whose numerator and 
denominator is 2. There is only one such fraction: 1/1 = 1. Those 
with sum 3 are 1/2 and 2/1 = 2. Next come 1/3, 2/2, 3/1, with 
sum 4, but 2/2 is omitted because it is not reduced. These are then 
followed by 1/4, 2/3, 3/2, 4/1, etc. We now write the fractions 
in this order under the series of natural numbers, 


1, 2,   3, 4,   5, 6,   7,   8,   9, ··· 
1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, ··· 


and can pair off each whole number with the fraction directly 
under it. The lower row does not leave out any rational number, 
because each one has a definite sum of numerator and denominator 
and must therefore have some place in the row. This, then, is 
the pairing off we desired. 
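
The arrangement by the sum of numerator and denominator can be generated mechanically. The sketch below follows the rule just described; the names `rationals` and `place_in_row` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """Yield the reduced positive fractions in order of numerator + denominator."""
    for total in count(2):
        for numerator in range(1, total):
            denominator = total - numerator
            if gcd(numerator, denominator) == 1:   # skip unreduced ones such as 2/2
                yield Fraction(numerator, denominator)

def place_in_row(q):
    """Whole number paired with the positive rational q."""
    for position, fraction in enumerate(rationals(), start=1):
        if fraction == q:
            return position

first_nine = list(islice(rationals(), 9))
print([str(f) for f in first_nine])
# ['1', '1/2', '2', '1/3', '3', '1/4', '2/3', '3/2', '4']
print(place_in_row(Fraction(2, 3)))   # 7
```

The search in `place_in_row` always ends, since a fraction whose numerator and denominator add up to a given sum is reached after finitely many steps.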

This surprising result is expressed by saying that the set of rational 
numbers is “countable” or “denumerable” because the pairing off 
with the natural numbers is, in effect, a counting off of the rationals. 
In general, a set is denumerable when it can be paired off with the 
natural numbers, that is, when it has the same power as the set of the 
natural numbers. Cantor goes on to show that several other sets which 
are much more extensive than the natural numbers do not actually 
have higher power. We shall pass by these examples since they 
involve further mathematical concepts and our one example is 
adequate. 

Thus far all the infinite sets considered have had the same power. 
Cantor’s theory would be trivial if there were no sets of higher power. 
Cantor showed that sets of higher power do exist by proving that the 
set of points of a line segment is of higher power than the set of natural 
numbers. The proof is an indirect one. We suppose that there is a 
pairing between the points of, say, a 1 inch segment and the natural 
numbers, and show that this leads to a contradiction. This pairing 
off puts the points in an order, the first paired with 1, the second 
with 2, etc., but this will obviously not be the natural order of the 
points on the line. It is convenient to identify each point by its 
distance from one end of the line segment. For example, the middle 
of the segment corresponds to the number 0.5 and each point 
corresponds to some decimal fraction. In order to measure the 
points exactly we must use infinite decimals; for example the end 
of the first one-third of the segment corresponds to 0.33333···. In 
the pairing off of the points of the segment with the natural numbers 
we can use the corresponding infinite decimal fractions instead of 
the points themselves. In the first place there will be some decimal 
0.··· corresponding to 1, in the second place a similar decimal, etc. 
It is a little easier to list them in a vertical column, giving us a list 
of the type shown below where we have inserted particular numbers 
merely for purposes of illustration. 


0.35420··· 
0.61773··· 
0.55549··· 
0.01007··· 
0.20206··· 
· · · · · · · 


Now we shall show that there is a decimal fraction 0.··· which, 
contrary to our assumption, is not included in the list. This decimal 
can be found as follows: we choose for the first digit after the decimal 
point a digit that is different from the first digit of the first decimal 
in the list. This gives us a choice of 9 digits. In order to make 
it definite, let us choose the digit 1 unless the first digit of the first 
decimal in the list is 1, in which case we shall choose 2. Now it is 
clear that our new decimal will differ from the first decimal of the 
list no matter what choice is made for the remaining digits, for the 
two certainly differ in their first digits and will represent different 
points even if they agree in all the other digits. We now choose 
1 to be the second digit unless (as in the above example) the second 
digit of the second decimal of the list is 1, in which case we choose 
2. Therefore in either case the second digit of our decimal is 
different from the second digit of the second decimal of the list. 
Then our decimal will differ from the second decimal of the list as 
well as from the first. We continue choosing digits in this way. 
In the illustrative example our decimal would begin 0.12111···. 
Since the process can be continued indefinitely, we have defined an 
infinite decimal fraction which is different from all the decimals 
in the list. The fact that the list does not contain this decimal 
contradicts the assumption that they could be paired off. Therefore 
the points of a line segment are not denumerable. 
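
The rule for choosing the digits can be applied to any finite beginning of such a list. In this sketch each decimal is represented by the string of its digits after the decimal point, and the function name `diagonal_decimal` is an assumption made for the illustration.

```python
def diagonal_decimal(rows):
    """Build a decimal whose k-th digit differs from the k-th digit of the k-th row."""
    digits = []
    for k, row in enumerate(rows):
        # Choose 1, unless the digit on the diagonal is itself 1; then choose 2.
        digits.append("2" if row[k] == "1" else "1")
    return "0." + "".join(digits)

# The five decimals of the illustrative list, digits after the decimal point only.
rows = ["35420", "61773", "55549", "01007", "20206"]
print(diagonal_decimal(rows))   # 0.12111
```

The result differs from the k-th decimal of the list in the k-th place, and it contains only the digits 1 and 2.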

There is a certain objection to the above proof which must 
be mentioned. It is caused by the decimals that end in an infinite 
row of 9’s, like 0.269999···, which is no different from 
0.270000··· = 0.27. The difficulty is that two different decimals 
can represent the same point if one of them ends in 9’s, and this 
means that we were not justified in replacing the points of the 
segment by the set of all decimal fractions. The two sets do not 
correspond exactly. Cantor avoided this difficulty by putting the 
proof in quite a different form, but there is a very simple way to 
dispose of the objection. We merely rule out all decimals ending 
in 9’s. That is, we would not identify a point by 0.269999··· 
but would use 0.270000···. All that we have to do is to make 
sure that the decimal we construct does not end in 9’s. But this 
does not occur, since it is made up entirely of the digits 1 and 2 
and does not contain any 9’s. 

This result has an interesting consequence. Since the set of 
rational numbers is denumerable, we see that it is of lower power 
than the set of all numbers between 0 and 1. Therefore there must 
certainly be numbers between 0 and 1 which are not rational. Thus 
we have proved the existence of irrational numbers by means of 
quite general considerations. The same thing was proved in an 
entirely different way in chapter 4. 

Cantor’s next result is also rather surprising. It states that the 
set of points of the surface of a square does not have higher power than the 
set of points of a side of the square. The reason this is so surprising is 
that it contradicts the intuitive idea of dimension. A one-dimen- 
sional segment has the same power as a two-dimensional square, 
and it can also be shown that a three-dimensional cube also has the 
same power. 

3 For a further discussion, see R. Courant and H. Robbins, What is Mathematics?, 
1941, Oxford University Press, New York, pp. 64-66, 80-82. 

[Fig. 25] 

As in the previous proof we shall use infinite decimal fractions 
and again we shall exclude decimals ending in 9’s. The points of 
the segment will again be represented by decimals 0.··· just as 
before. The points of the square will be characterized by a pair of 
decimals, one decimal, x, representing the horizontal distance from 
the left side of the square and another, y, representing the vertical 
distance from the lower side of the square (Fig. 25). We shall 
set up a pairing in the following way. Each point P of the square 
is characterized by two decimals 

$$x = 0.a_1 a_2 a_3 \cdots, \qquad y = 0.b_1 b_2 b_3 \cdots.$$

From these two we form the single decimal 

$$z = 0.a_1 b_1 a_2 b_2 a_3 b_3 \cdots$$


by alternately taking digits from x and y. Then we pair off the 
point P of the square with the point Q of the segment characterized 
by the decimal z. For example, the center of the square has 
x = 0.500···, y = 0.500···, and it is paired off with the point 
of the segment corresponding to z = 0.550000···. In this way 
each point of the square is paired off with some point of the segment. 
But this is not quite enough. If we had only to pair each point of 
the square with some point of the segment we could merely pair 
each point P with the point directly above P on the upper side of the 
square. But then each point of the side would be paired, not with 
just one point of the square, but with an infinite number (all the 
points on a vertical line). But this is ruled out in Cantor’s method 
of comparing powers of sets, just as in the case of the dancers we 
did not suppose that one person would dance with several partners. 
Our original, more refined, method of pairing off avoids this diffi- 
culty. For if a second point P’ of the square with, say, 


$$x' = 0.a_1' a_2' a_3' \cdots, \qquad y' = 0.b_1' b_2' b_3' \cdots$$

was paired with the same point Q of the segment, then we would have 

$$z = 0.a_1' b_1' a_2' b_2' a_3' b_3' \cdots.$$

This decimal expression for z and the original one can be equal only 
if the two correspond exactly in all digits, 

$$a_1' = a_1, \quad b_1' = b_1, \quad a_2' = a_2, \quad b_2' = b_2, \quad \cdots.$$


But this shows that all the digits of x’ are equal to the corresponding 
digits of x, and the same is true for y’ and y. Therefore we have 
x’ = x, y’ = y, and P’ is the same point as P, contrary to the assumption 
that it was a different point. Consequently two different 
points of the square are never paired off with the same point of the 
segment. 

The pairing will be complete when we show that no point of the 
segment is left unpaired. This is easily seen, for if 


$$z = 0.c_1 c_2 c_3 c_4 c_5 c_6 \cdots$$


corresponds to any point of the segment, then the point of the 
square with 

$$x = 0.c_1 c_3 c_5 \cdots, \qquad y = 0.c_2 c_4 c_6 \cdots$$


is paired off with the point corresponding to the decimal obtained 
by taking digits alternately from x and y, and this is exactly the 
decimal z with which we started. 
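
The two directions of the pairing can be tried out on finite strings of digits. This is only a sketch on truncated decimals; the names `interleave` and `separate` are assumptions, and each decimal is represented by the string of its digits after the decimal point.

```python
def interleave(x_digits, y_digits):
    """Digits of z, taken alternately from x and y (equal-length strings)."""
    if len(x_digits) != len(y_digits):
        raise ValueError("use the same number of digits for x and y")
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def separate(z_digits):
    """Digits of x (odd places of z) and of y (even places of z)."""
    return z_digits[0::2], z_digits[1::2]

# The center of the square: x = 0.500..., y = 0.500...
print(interleave("500", "500"))   # 550000
print(separate("550000"))         # ('500', '500')
```

Separating and interleaving undo each other, which is the content of the two halves of the argument above.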

There is still an objection to this proof which arises from decimals 
ending in 9’s. The point on the segment with z = 0.2202020··· 
is paired with the point of the square having 

$$x = 0.2000\cdots, \qquad y = 0.2222\cdots.$$

The point on the segment with z’ = 0.12929292··· is paired with 
the point of the square having 

$$x' = 0.1999\cdots, \qquad y' = 0.2222\cdots,$$

but here we have a decimal ending in 9’s and we should write 
x’ = 0.2000···. Therefore the two different points on the segment, 
corresponding to z and z’, are paired with the same point of the 
square. 


A very simple, although not obvious, trick will eliminate this 
difficulty. In the exact form in which it is given above, the proof 
is not correct, but if we make the one following modification it 
becomes valid. Whenever a digit 9 occurs in a decimal, we com- 
bine it with the digit to its right to form an inseparable “molecule”. 
If there are several consecutive 9’s we include all of them and the 
next digit in the molecule. Thus in the above example we would 
have 0.(1)(2)(92)(92)(92)··· and, as a new example, we would 
write z’’ = 0.(7)(3)(94)(990)(9997)···. These molecules are to 
take the place of the digits in the proof, so that from z’’, for example, 
we would have 

$$x'' = 0.(7)(94)(9997)\cdots, \qquad y'' = 0.(3)(990)\cdots.$$

The pairing is now quite different from what it was before but, 
after a moment’s reflection, we see that the proof now goes through 
without difficulty so long as we remember to rule out decimals 
ending in 9’s. 

At this point Cantor encountered a problem which has since 
become famous. This was the question whether there exists a set 
having a higher power than the set of natural numbers but a lower 
power than the set of points of a segment. 

This problem has been named “the problem of the continuum”. 
It defied all of Cantor’s attempts to find an answer, as well as those of 
his successors. Probably no other mathematical problem formulated 
with so little mathematical preparation has ever defied solution so 
stubbornly as this one. The only concepts used in the formulation 
of the problem are those of whole numbers and line segments. 
There is nothing remarkable in proposing one or many very difficult 
problems if one uses complicated mathematical ideas. Using only 
simple ideas to formulate a problem that is neither easily solved 
nor trivial reveals the true art of mathematics. Considered from 
this point of view, the problem of the continuum stands out as a 
shining example. 

It soon became evident that the fundamental theory of sets, on 
which all of mathematics depends, could not be properly developed 
until the concept of what is meant by a set had been carefully 
analyzed. One is forced to contemplate such an analysis because 
of the following famous paradox in the theory of sets. 

We have been talking of sets. These sets consist of ‘elements’ 
of one sort or another. The set of points of a segment contains as 
elements the individual points of the segment. The whole numbers 
themselves are the elements of the set of whole numbers. The 
relation between the sets and elements is the same as that between 
an association and its members. Sometimes an association has as 
members not individuals but associations. For example, the 
United Nations is an association of nations, each of which is an 
association of members or citizens. The actual membership of the 
United Nations is made up of individual countries, not of the citizens 
of those countries. In the same way a set can contain sets as 
elements. An example is the set of all denumerable sets, all of 
whose elements are sets in themselves. Being a citizen of a parti- 
cipating nation does not make a person a member of the United 
Nations. Similarly the number 1/5 is an element of the set of 
rational numbers, which we have proved denumerable, but that 
does not make it an element of the set of all denumerable sets. 
Can a set contain itself as an element? The ordinary sets which 
we have been considering do not have this property. However, it 
is easy to give examples to show that such extraordinary sets do 
exist. The set of all conceivable sets is of this type, since it is a 
particular set in itself. For the moment we shall call a set that 
contains itself as an element an extraordinary set, while we shall 
call the others ordinary. 

Now let us consider the set of all ordinary sets, and let us call it 
the set s. Is s itself an ordinary or an extraordinary set? It must 
be one or the other. If s is extraordinary then it must contain 
itself as an element. But then s is a member of s and hence is an 
ordinary set, as are all elements of s. This is a contradiction, so 
s is not extraordinary. However, if s is ordinary, then it does not 
contain itself as an element. Therefore s is not an element of s, 
which contains all ordinary sets as elements, and so s is not ordinary. 
This is again a contradiction, so s is not ordinary. This is the 
paradox: s must be either ordinary or extraordinary, but each 
possibility leads to a contradiction. 

This paradox is not specifically restricted to the theory of sets. 
In order to make this clear we shall reformulate the same paradox 
in a rather frivolous way. This formulation is completely free of 
the idea of sets. In a certain regiment a soldier is detailed to take 
over the duties of barber. His exact orders direct him to shave 
everyone in the regiment who does not shave himself. Should the 
soldier shave himself? If he shaves himself, then he is one who 
shaves himself, and his orders direct him not to. If he doesn’t 
shave himself then he is one who does not shave himself and again 
he violates his orders. What can the man do to carry out his 
orders strictly? 

The paradox is a purely logical one. We are inescapably led 
to it in the theory of sets, but it is more general than that and it 
does not need that theory for its formulation. The old, rather dull, 
subject of logic has developed into something quite interesting. As 
a matter of fact, both mathematicians and logicians have been work- 
ing for some time to free logic from its old Aristotelian form, but at 
present it is difficult to see just what new form it will take. 

1. A simple example will serve to show what type of problem 
we shall discuss. Suppose we have 4 red balls (R), 1 yellow ball 
(Y), and 2 white balls (W). These balls are supposed to be of the 
same size and weight and completely indistinguishable except by 
their colors. We also suppose that we have two urns, A and B. 
Urn A will hold exactly 3 balls, B will hold 4. In how many different 
ways can the 7 colored balls be distributed between the urns A and B? 

Since we have a very simple case with only two urns, we need 
only consider the balls that are put into A. The remaining 4 
balls will then have to be put into B. To answer our question we 
shall systematically list all the possibilities. First of all, A might 
contain only red balls and B whatever is left over: 


1. in A: RRR, in B: RYWW. 


It does not matter which particular three red balls are put into A, 
since we supposed that the balls could not be distinguished from 
each other except by their colors. 

Next A might contain just 2 red balls, in which case the third ball 
must be either yellow or white. This gives the two distributions 


2. in A: RRY, in B: RRWW, 
3. in A: RRW, in B: RRYW. 


If A contains just 1 red ball, then the other two will clearly 
have to be either YW or WW, giving 


4. in A: RYW, in B: RRRW, 
5. in A: RWW, in B: RRRY. 


Finally, if A contains no red balls, it must contain the other three: 
6. in A: YWW, in B: RRRR. 


Therefore we see that there are 6 possible ways to distribute 4 red, 
1 yellow, and 2 white balls between two urns which hold 3 and 4 
balls respectively. Clearly it is quite immaterial of what colors the 
balls are; we would get the same result with 4 black, 1 green, and 
2 blue balls, so it is necessary only to tell the number of balls of 
each color, not their colors. In order to state our result a little 
more briefly, we will then say: the number of distributions of 4, 1, 2 
balls between urns of contents 3 and 4 is 6. We will also write 
the same fact in symbols in the form: 

$$\{4, 1, 2 \mid 3, 4\}_7 = 6.$$

Here the numbers in front of the vertical bar are the number of 
balls of the various colors, and the numbers after the bar are the 
number of balls that each urn can hold. The sum of the numbers in 
front of the bar must equal the sum of the numbers after the bar, since all 
the balls together just fill all the urns. This total number of balls, 
which is 7 in this case, has been written as a subscript to the right 
of the bracket. 
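
The systematic listing used above can be left to a short program. The following is a sketch by direct enumeration, not a formula for the symbol; the names `fillings` and `distributions` are assumptions made for this illustration, and each urn’s contents are recorded as a tuple giving the number of balls of each color.

```python
def fillings(available, capacity):
    """Yield every way to take exactly `capacity` balls from the available color counts."""
    if not available:
        if capacity == 0:
            yield ()
        return
    first, rest = available[0], available[1:]
    for take in range(min(first, capacity) + 1):
        for tail in fillings(rest, capacity - take):
            yield (take,) + tail

def distributions(colors, urns):
    """Yield each distribution as a tuple of urn contents (one count per color)."""
    if sum(colors) != sum(urns):
        raise ValueError("the balls must exactly fill the urns")

    def place(remaining, urn_index):
        if urn_index == len(urns):
            yield ()
            return
        for take in fillings(remaining, urns[urn_index]):
            left = tuple(r - t for r, t in zip(remaining, take))
            for later in place(left, urn_index + 1):
                yield (take,) + later

    yield from place(tuple(colors), 0)

# 4 red, 1 yellow, 2 white balls; urn A holds 3, urn B holds 4.
found = list(distributions((4, 1, 2), (3, 4)))
for in_a, in_b in found:
    print("in A:", in_a, " in B:", in_b)
print(len(found))   # 6
```

Each printed pair gives the numbers of red, yellow, and white balls in A and in B, and the six pairs are the six distributions listed above, in a different order.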

2. It is not necessary to consider just three colors and two urns. 
As a general problem we might consider n balls of c different colors 
and u urns of total content n. We would then use the symbol 

$$\{r, s, \cdots \mid a, b, \cdots\}_n \tag{1}$$

for the number of distributions of n balls, of which r are one color, 
s another, etc., among urns which will hold a, b, ··· balls respectively. 
The problem would be to compute the value of this symbol from 
the numbers n, r, s, ···, a, b, ···. We will not solve this problem 
in such a general form, but rather will restrict ourselves to a number 
of examples and important special cases. 

The problem does not really require the objects distributed to be 
colored balls. For example, we have already used the letters 
RRRRYWW to designate the 4 red, 1 yellow, and 2 white balls, 
and have distributed these 7 letters between 2 sets A and B instead 
of distributing 7 balls between 2 urns A and B. 

3. We shall take up a number of examples having nothing to 
do with colored balls, but we will try to discover a way of interpreting 
each case as a distribution of colored balls. These examples are of 
considerable interest and importance and they will serve to show 
the significance of the symbol (1). 

Example I. In how many ways can n persons be seated at n places? 
In order to interpret this in the form of our previous problem, we 
note that each person can be distinguished from every other one. 
Therefore we can designate each person by a different color, a name, so 
to speak. The problem then reduces to: in how many ways can n balls 
of n different colors be put in n different places? Each of these n places 
corresponds in our old problem to an urn that will hold just one ball. 
Using our symbol, the problem reduces to finding the value of 

$$P_n = \{1, 1, \cdots, 1 \mid 1, 1, \cdots, 1\}_n \tag{2}$$

where there are n 1’s in front of the vertical bar, corresponding to 
n different colored balls, and n 1’s after the bar corresponding to 
the n urns, each holding just 1 ball. 
If we think of the n urns or places as being put in a row, then we 
are asking in how many ways n balls or persons (or any distinguishable 
objects) can be arranged in a row? Such an arrangement is called 
a permutation, so we can say that $P_n$ of (2) represents the number of 
permutations of n different things. 

There still remains the problem of finding the numerical value 
of $P_n$ if we are told how big n is. We shall find a formula for $P_n$ 
in terms of n, but shall postpone it until we have taken up a number 
of further examples. 

4. Example II. In the game of skat 32 different cards are used 
and 3 players participate. Each player is dealt 10 cards and the 
remaining 2 cards go into the “skat”. In how many different ways 
can the hands be dealt out? Clearly the number of ways is the 
same as the number of distributions of 32 differently colored balls 
among 4 urns that will hold 10, 10, 10, 2 balls respectively. Therefore 
the number of ways the cards can be distributed in a game of 
skat is given by 

$$S = \{1, 1, \cdots \mid 10, 10, 10, 2\}_{32}. \tag{3}$$


Again we will postpone the numerical computation of this symbol. 

5. Example III. The so-called polynomial theorem is another interesting 
example. We shall consider only a special case involving 
three variables x, y, z. The expression 

$$(x + y + z)^n$$


is to be multiplied out. This n-th power represents a product of 
n equal factors, each of which is (x + y + z). If a sum of terms 
enclosed in parentheses is to be multiplied by a factor, then each 
term inside the parentheses must be multiplied by that factor. If 
this rule is applied to all the n parentheses in the n-th power, then 
it is seen that the power consists of a sum of products. Each of 
these products has n factors, which can be x or y or z. Since the 
factors of a product can be taken in any order we can bring all the 
x’s together, then the y’s, and then the z’s. Then each product 
will be of the form $x^a y^b z^c$ where a, b, c are whole numbers subject 
to the restrictions 

$$a + b + c = n, \qquad a \ge 0, \quad b \ge 0, \quad c \ge 0. \tag{4}$$

A factor x can originate in any of the n parentheses and so can factors 
y and z. Therefore there can be several products $x^a y^b z^c$ with the 
same values of a, b, c. How often will $x^a y^b z^c$ arise when we multiply 
out the n-th power? One of the factors x, y, or z must originate 
from each of the n parentheses. We can imagine a row of n urns, 
one corresponding to each parenthesis. Then we can put into 
each urn the letter that originates in the corresponding parenthesis. 
Now we need only count how many ways a elements “x” (red balls), 
b elements “y” (yellow balls), and c elements “z” (white balls) can 
be distributed among n urns each holding just one element. We 
shall designate this number by $P^{(n)}_{a,b,c}$, and we now have 

$$P^{(n)}_{a,b,c} = \{a, b, c \mid 1, 1, \cdots, 1\}_n. \tag{5}$$

When the power $(x + y + z)^n$ is multiplied out, the term $x^a y^b z^c$ 
will occur $P^{(n)}_{a,b,c}$ times. These like terms can be collected, and the 
combined term will then have the coefficient $P^{(n)}_{a,b,c}$. 

A particular case will help to make this result clearer. Let us 
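take n = 4 and think of multiplying out the power $(x + y + z)^4$. 
Terms of the type $x^a y^b z^c$ will occur with all values of a, b, c that 
are consistent with (4) for n = 4. Listing them systematically, we find 
the following 15 possibilities: 

$$\begin{array}{llllll}
x^4, & y^4, & z^4, \\
x^3y, & x^3z, & xy^3, & y^3z, & xz^3, & yz^3, \\
x^2y^2, & x^2z^2, & y^2z^2, \\
x^2yz, & xy^2z, & xyz^2.
\end{array}$$

Supplying these terms with the coefficients we have found, we get 

$$\begin{aligned}
(x + y + z)^4 ={}& P^{(4)}_{4,0,0}\,x^4 + P^{(4)}_{0,4,0}\,y^4 + P^{(4)}_{0,0,4}\,z^4 \\
&+ P^{(4)}_{3,1,0}\,x^3y + P^{(4)}_{3,0,1}\,x^3z + P^{(4)}_{1,3,0}\,xy^3 + P^{(4)}_{0,3,1}\,y^3z \\
&+ P^{(4)}_{1,0,3}\,xz^3 + P^{(4)}_{0,1,3}\,yz^3 \\
&+ P^{(4)}_{2,2,0}\,x^2y^2 + P^{(4)}_{2,0,2}\,x^2z^2 + P^{(4)}_{0,2,2}\,y^2z^2 \\
&+ P^{(4)}_{2,1,1}\,x^2yz + P^{(4)}_{1,2,1}\,xy^2z + P^{(4)}_{1,1,2}\,xyz^2.
\end{aligned} \tag{6}$$

This gives only the form of the power. We must still find the 
values of $P^{(4)}_{a,b,c}$ and, more generally, of $P^{(n)}_{a,b,c}$. 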
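
For a small power the multiplying out can be done by brute force, taking one letter from each parenthesis in every possible way and collecting like terms. This is a sketch only, not a formula for the coefficients; the function name `expansion_counts` is an assumption made for the illustration.

```python
from collections import Counter
from itertools import product

def expansion_counts(n, letters="xyz"):
    """Count how often each exponent triple (a, b, c) arises in (x + y + z)^n."""
    counts = Counter()
    # One letter is chosen from each of the n parentheses.
    for choice in product(letters, repeat=n):
        exponents = tuple(choice.count(letter) for letter in letters)
        counts[exponents] += 1
    return counts

counts = expansion_counts(4)
print(len(counts))            # number of different terms
print(sum(counts.values()))   # number of products before collecting like terms
for exponents in sorted(counts, reverse=True):
    print(exponents, counts[exponents])
```

For n = 4 the program finds the 15 exponent triples listed above, and each printed count is the number of times the corresponding product arises.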

6. Example IV. In how many different ways can k things be 
chosen from among n different things? This number is usually 
called the number of combinations of n things taken k at a time and 
is designated by the symbol $C^{(n)}_k$. The n things are all different, so 
we can consider them as differently colored balls, r = 1, s = 1, ···. 
The k balls that are chosen can be put in one urn, a = k, and the 
remaining ones can be put in another urn, b = n − k. The number 
we are trying to determine is therefore 

$$C^{(n)}_k = \{1, 1, \cdots, 1 \mid k, n - k\}_n. \tag{7}$$
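
The interpretation with two urns can be illustrated by listing the choices for a small case. This sketch relies on `itertools.combinations` from the Python standard library; the function name `selections` and the four sample things are assumptions made for the illustration.

```python
from itertools import combinations

def selections(things, k):
    """List every way to put k of the distinct things in one urn and the rest in another."""
    things = list(things)
    result = []
    for chosen in combinations(things, k):
        left_over = [t for t in things if t not in chosen]
        result.append((list(chosen), left_over))
    return result

for chosen, left_over in selections("abcd", 2):
    print("chosen:", chosen, " remaining:", left_over)
print(len(selections("abcd", 2)))   # 6
```

Every choice of k things fills the first urn and fixes the contents of the second, so counting choices and counting distributions between the two urns give the same number.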

7. The duality of the distribution symbol. The symbol that we have 
used to represent the number of distributions has a very important 
property that will help us to compute its value. The numbers in 
front of the vertical bar and those following it can be interchanged, 


$$\{r, s, \cdots \mid a, b, \cdots\}_n = \{a, b, \cdots \mid r, s, \cdots\}_n. \tag{8}$$

Stated in terms of distributions, this asserts that there are exactly 
the same number of ways to distribute r red balls, s white, ··· among 
urns that will hold a, b, ··· balls respectively, as there are ways 
to distribute a red balls, b white, ··· among urns that will hold 
r, s, ··· balls respectively. Here, as always, we are supposing that 
r + s + ··· = a + b + ··· = n. 


Expressed in this way, in terms of distributions, the equality (8) 
is very easily proved. A simple numerical example will completely 
demonstrate the proof. Let us prove the equality 


$$\{3, 4 \mid 1, 1, 5\}_7 = \{1, 1, 5 \mid 3, 4\}_7. \tag{9}$$


The left side represents the number of distributions of 3 red and 4 
white balls 
R, R, R, W, W, W, W, 


among the three urns, A and B each holding 1 ball and C holding 
the other 5. Let us look at one of the possible distributions, say 


|R|   |W|   |R R W W W|
 A     B         C


Now we can just as well write this distribution by listing the balls 
in a row and putting under each one the urn into which it goes, 


(10a)    R, W, R, R, W, W, W
         A, B, C, C, C, C, C


This is just a list of 7 pairs of letters. The upper letters R and W
are the names of the colors, the lower letters A, B, C are the names
of the urns. We can interchange the roles of the letters and let
A, B, C be the names of colors, R and W the names of urns. If
we now write the colors in the upper row, the urns in the lower,
and rearrange the pairs according to urns, we have the scheme


(10b)    A, C, C, B, C, C, C
         R, R, R, W, W, W, W.


Exactly the same pairs appear in (10b) as in (10a), only each has
been inverted. But (10b) can be interpreted as representing a
distribution of 1 ball of color A, 1 of B, and 5 of C among the two
urns R and W, of content 3 and 4 respectively. But this is one
of the distributions which are counted by the right side of (9).
Every other distribution counted by the left side of (9) will correspond
to one counted by the right side, in the same way as (10a)
corresponds to (10b). This correspondence clearly works in the other
direction, from (10b) to (10a), in just the same way. Therefore
there is a complete correspondence, a “duality,” between the two
problems represented by the left and right sides of (9). Since the
two sets of distributions are paired off exactly, they must have the
same number, and hence (9) is a true equality.

In proving that the two sides of (9) are equal, we did not have 
to know the numerical value of either side. In this case, however, 
it is easy to list all possible distributions systematically and thus 
to find 

$\{3, 4 \mid 1, 1, 5\}_7 = \{1, 1, 5 \mid 3, 4\}_7 = 4.$


This proof of (9) consisted merely in interchanging the names of 
the colors and the names of the urns. We do not need to go through 
the detailed proof of (8). The same interchange of colors and urns 
shows the duality of the two problems and therefore the general 
equality (8). 
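
The duality can also be checked by direct counting. The following is a minimal Python sketch, not part of the original text: the function names `splits` and `count_distributions` are illustrative, and balls of the same color are assumed to be indistinguishable, as in the urn model above.

```python
from functools import lru_cache


def splits(k, free):
    """Yield the remaining capacities after placing k like balls into urns
    whose free capacities are given by the tuple `free`."""
    if not free:
        if k == 0:
            yield ()
        return
    for take in range(min(k, free[0]) + 1):
        for rest in splits(k - take, free[1:]):
            yield (free[0] - take,) + rest


def count_distributions(colors, urns):
    """Count the distributions {colors | urns}: colors[i] balls of color i
    are placed into urns that hold exactly urns[j] balls each."""
    if sum(colors) != sum(urns):
        raise ValueError("the balls must exactly fill the urns")

    @lru_cache(maxsize=None)
    def place(index, free):
        if index == len(colors):
            return 1  # every ball is placed, so every urn is full
        return sum(place(index + 1, rest) for rest in splits(colors[index], free))

    return place(0, tuple(urns))


left = count_distributions((3, 4), (1, 1, 5))
right = count_distributions((1, 1, 5), (3, 4))
print(left, right)
```

Both calls count the same tables of numbers, read once by rows and once by columns, which is the interchange of colors and urns described in the text.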

8. The computation of the value of distributions in certain cases. The
computation of the value of $\{r, s, \dots \mid a, b, \dots\}_n$ for arbitrary
numbers r, s, ..., a, b, ... is quite troublesome and we shall not
attempt it. The symbols that arose in paragraphs 3 to 6 are not
of the most general sort. They all have the peculiarity that either
all the numbers in front of the vertical bar or all the numbers
following the bar (or both) consist entirely of 1’s. That is, either
all the balls have different colors or each urn can hold just one
ball. Because of the duality we need only consider the case where
the balls all have different colors, and we may compute


$\{1, 1, \dots, 1 \mid a, b, c, \dots\}_n.$


The subscript n reminds us that there are n balls and that all the
urns together will hold just n balls,


$1 + 1 + \dots + 1 = a + b + c + \dots = n.$

Since we are only considering symbols with 1’s in front of the bar,
we don’t always need to write them in, and we can use the shorter
notation,


$\{1, 1, \dots, 1 \mid a, b, c, \dots\}_n = \{a, b, c, \dots\}_n.$

The numbers a, b, c, ..., representing the sizes of the urns, are
subject to no restrictions other than that their total be n. Obviously
we have


(11) $\{n\}_n = 1$


for there is only one way to put all the balls into a single urn. If
we replace the single urn N of content n by two urns, $N_1$ of content
n − 1 and N′ of content 1, then one of the balls must be taken
from N and put into N′. The remaining n − 1 balls are put into
$N_1$. The one ball for N′ can be any one and, since they are all
different, this gives us n possibilities. Therefore we have


$\{n - 1, 1\}_n = n.$

In the same way we prove

(12) $a\,\{a, b, c, \dots\}_n = \{a - 1, 1, b, c, \dots\}_n.$


Here the urns B, C, ... of content b, c, ... are left unchanged,
but the urn A of content a is replaced by two urns, $A_1$ of content
a − 1 and A′ of content 1. We can go from a distribution among
A, B, C, ... to a distribution among $A_1$, A′, B, C, ... by taking
any one ball from A for A′, putting the remaining a − 1 balls of
A into $A_1$, and leaving the balls in B, C, ... unchanged. Again,
since the balls are all different, there are a different possible choices
for the ball that is to be put in A′. Therefore there are a times as
many distributions among $A_1$, A′, B, C, ... as among A, B, C, ...,
and that is exactly the meaning of (12).

Now we can replace $A_1$ by two urns, $A_2$ of content a − 2 and
A″ of content 1, and will clearly have


$(a - 1)\,\{a - 1, 1, b, c, \dots\}_n = \{a - 2, 1, 1, b, c, \dots\}_n.$

In the same way we find

$(a - 2)\,\{a - 2, 1, 1, b, c, \dots\}_n = \{a - 3, 1, 1, 1, b, c, \dots\}_n$


and a whole series of similar equations, of which the last is


$2\,\{2, \underbrace{1, \dots, 1}_{a-2}, b, c, \dots\}_n = \{\underbrace{1, 1, \dots, 1}_{a}, b, c, \dots\}_n.$


In this last step we have broken a down into a series of 1’s. If we
multiply (12) and all the equations that follow it together then both
sides have the common factors

$\{a - 1, 1, b, c, \dots\}_n,\ \{a - 2, 1, 1, b, c, \dots\}_n,\ \dots,\ \{2, \underbrace{1, \dots, 1}_{a-2}, b, c, \dots\}_n$


and these can all be cancelled out without altering the equality.
In this way we find


$a(a - 1)(a - 2) \cdots 2\,\{a, b, c, \dots\}_n = \{\underbrace{1, 1, \dots, 1}_{a}, b, c, \dots\}_n.$


In the product $a(a - 1)(a - 2) \cdots 2$ we shall reverse the order
and use the notation⁴


$a! = 1 \cdot 2 \cdot 3 \cdots (a - 1) \cdot a.$

Our result can then be written as

(13) $a!\,\{a, b, c, \dots\}_n = \{\underbrace{1, 1, \dots, 1}_{a}, b, c, \dots\}_n.$


Exactly the same process can be used to replace the urn B by
smaller ones. This gives


$b!\,\{\underbrace{1, \dots, 1}_{a}, b, c, \dots\}_n = \{\underbrace{1, \dots, 1}_{a}, \underbrace{1, \dots, 1}_{b}, c, \dots\}_n,$

which can be combined with (13) to yield

$a!\,b!\,\{a, b, c, \dots\}_n = \{\underbrace{1, 1, \dots, 1}_{a+b}, c, \dots\}_n.$


By treating C the same way, and continuing through the rest of the
urns, we finally find


(14) $a!\,b!\,c! \cdots \{a, b, c, \dots\}_n = \{1, 1, \dots, 1\}_n,$


where, obviously, the right hand symbol contains exactly n 1’s.
If a = n in (13) then there is only one urn, and (13) becomes


$n!\,\{n\}_n = \{1, 1, \dots, 1\}_n.$

Therefore, using (11), we find

(15) $\{1, 1, \dots, 1\}_n = n!.$


Using this value in (14), we then have the formula


(16) $\{1, 1, \dots, 1 \mid a, b, c, \dots\}_n = \{a, b, c, \dots\}_n = \dfrac{n!}{a!\,b!\,c! \cdots}$

for the computation of our symbol, at least in all the cases that
arise from our examples.

4 a! is read “a factorial”.
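
Formula (16) and the step (12) that leads to it can be compared with a direct count for small urns. This is an illustrative Python sketch under the assumption that the n balls are all different; the names `symbol` and `brute_force` are hypothetical and do not come from the text.

```python
from itertools import product
from math import factorial


def symbol(*urns):
    """Value of {a, b, c, ...}_n by formula (16): n! / (a! b! c! ...)."""
    value = factorial(sum(urns))
    for size in urns:
        value //= factorial(size)  # each partial quotient is a whole number
    return value


def brute_force(*urns):
    """Count the assignments of n different balls to urns that fill
    urn j with exactly urns[j] balls."""
    n = sum(urns)
    count = 0
    for assignment in product(range(len(urns)), repeat=n):
        if all(assignment.count(j) == size for j, size in enumerate(urns)):
            count += 1
    return count


print(symbol(3, 2, 1), brute_force(3, 2, 1))
# Relation (12) with a = 3, b = 2, c = 1:
print(3 * symbol(3, 2, 1), symbol(2, 1, 2, 1))
```

The brute-force count tries every assignment, so it is only practical for a handful of balls; it serves here as an independent check of the formula.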

9. We can now return to the special examples.

Example I. From (2) and (15) we have


$\{\underbrace{1, 1, \dots, 1}_{n} \mid 1, 1, \dots, 1\}_n = \{1, 1, \dots, 1\}_n = n!,$


that is, n persons can be seated at n places in n! different ways, or
there are n! permutations of n different things. The factorials
that appear in these formulas increase very rapidly. The first
10 are:


1! = 1           6! = 720
2! = 2           7! = 5,040
3! = 6           8! = 40,320
4! = 24          9! = 362,880
5! = 120        10! = 3,628,800.
Example II. From (3) and (16) we find

$S = \dfrac{32!}{10!\,10!\,10!\,2!}$

for the number of different ways the cards can be distributed in
a game of skat. This is indeed a large number. On computing,
it turns out to be

S = 2,753,294,408,504,640.

The reader may be interested in making the similar computation
for the hands of bridge.
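
The computation for skat, and the one suggested for bridge, can be carried out with exact integer arithmetic. The sketch below is an illustration only; the helper name `deals` is hypothetical, and the bridge deal is assumed to be 52 cards dealt as four hands of 13.

```python
from math import comb, factorial


def deals(*hands):
    """Number of ways to split sum(hands) different cards into hands of
    the given sizes, by formula (16)."""
    value = factorial(sum(hands))
    for size in hands:
        value //= factorial(size)
    return value


skat = deals(10, 10, 10, 2)       # three hands of 10 and the 2-card skat
bridge = deals(13, 13, 13, 13)    # assumed: four hands of 13 cards each
print(f"{skat:,}")
print(f"{bridge:,}")

# Cross-check: choose the hands one after another with binomial coefficients.
assert bridge == comb(52, 13) * comb(39, 13) * comb(26, 13)
```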


Example III. Making use of the duality (8) as well as (5) and
(16), we obtain


$P^{(n)}_{a,b,c} = \dfrac{n!}{a!\,b!\,c!}$   (a + b + c = n).


If these are computed for n = 4, they can be used in (6) to give
the expansion

$(x + y + z)^4 = x^4 + y^4 + z^4$
    $+ 4x^3y + 4x^3z + 4xy^3 + 4y^3z + 4xz^3 + 4yz^3$
    $+ 6x^2y^2 + 6x^2z^2 + 6y^2z^2$
    $+ 12x^2yz + 12xy^2z + 12xyz^2.$
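
The coefficients of this expansion can be tabulated mechanically. A short Python sketch, with the illustrative name `trinomial_coefficients` (not from the text), lists the exponent triples (a, b, c) with a + b + c = n together with n!/(a! b! c!):

```python
from math import factorial


def trinomial_coefficients(n):
    """Map each exponent triple (a, b, c) with a + b + c = n to the
    coefficient n! / (a! b! c!) of x^a y^b z^c in (x + y + z)^n."""
    table = {}
    for a in range(n, -1, -1):
        for b in range(n - a, -1, -1):
            c = n - a - b
            table[(a, b, c)] = factorial(n) // (
                factorial(a) * factorial(b) * factorial(c)
            )
    return table


table = trinomial_coefficients(4)
for (a, b, c), coefficient in table.items():
    print(f"{coefficient:2d} x^{a} y^{b} z^{c}")
```

Setting x = y = z = 1 gives a quick sanity check: the 15 coefficients must add up to 3^4 = 81.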

Example IV. Finally (7) and (16) give the value

$C^{(n)}_k = \dfrac{n!}{k!\,(n - k)!}$

for the number of combinations of n things taken k at a time.


The sequence of squares 1, 4, 9, 16, 25, ... becomes less and less
dense as we go further out. The gaps between the consecutive
squares become longer and longer. Although many numbers are
not squares, some of them can at least be considered as sums of two
squares, for example, 13 = 9 + 4, 41 = 25 + 16, etc. But not
every number can be written as a sum of two squares. If we try
to express the number 6 as a sum of two squares, the available
squares are 1 and 4, the only squares that are less than 6. Neither
1 + 1 nor 4 + 4 nor 1 + 4 gives 6, so 6 requires at least three
squares. In fact, 6 can be expressed as a sum of three squares,
6 = 4 + 1 + 1. The same procedure shows that 7 cannot be
expressed as a sum of three squares, since the smallest number that
will suffice is four and 7 = 4 + 1 + 1 + 1. For 8 = 4 + 4, two
squares again suffice, 9 is itself a square, and we have 10 = 9 + 1,
11 = 9 + 1 + 1, 12 = 9 + 1 + 1 + 1 = 4 + 4 + 4, etc.
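
The trial-and-error search described here is easy to mechanize. The following Python sketch (the function name `min_squares_table` is illustrative) computes, for every number up to a limit, the smallest number of squares that add up to it, building each value from the values already found.

```python
from math import isqrt


def min_squares_table(limit):
    """best[n] is the smallest number of positive squares with sum n."""
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one square k*k and reuse the answer for what is left.
        best[n] = 1 + min(best[n - k * k] for k in range(1, isqrt(n) + 1))
    return best


best = min_squares_table(1000)
for n in range(1, 13):
    print(n, best[n])
print("largest count up to 1000:", max(best[1:]))
```

Such a table proves nothing about all numbers, but it shows the counts 3 for 6, 4 for 7, and 2 for 8 found above.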

One would naturally expect that we would soon come to a point 
where four squares would no longer be enough, and that on con- 
tinuing, more and more squares would be needed. However, 
Fermat, who ranks with Descartes among the greatest mathematicians 
of the 17th century, proved the very surprising fact that every positive 
whole number can be expressed as a sum of at most four squares. 

Waring conjectured that a similar fact could be proved for cubes, 
fourth powers, etc. and he brought up the question as to how many 
cubes, fourth powers, and so on would be required. Because of 
this his name has been attached to this set of problems. The cubes 
are the numbers 1, 8, 27, 64, .... If we try to express the smaller
numbers as sums of cubes, we see that 7, being the last number
before 8, must be expressed entirely by means of 1’s. Therefore
7 = 1 + 1 + 1 + 1 + 1 + 1 + 1 requires 7 cubes. Similarly,
15 = 8 + 1 + 1 + 1 + 1 + 1 + 1 + 1 requires 8 cubes; and
23 = 8 + 8 + 1 + 1 + 1 + 1 + 1 + 1 + 1 requires 9. Before
we reach 31 a new cube, 27, becomes available, and the whole
situation is changed. In fact, 31 = 27 + 1 + 1 + 1 + 1 requires
only 5 cubes.

C. G. J. Jacobi had the computer Dahse compile a list showing 
the decomposition of each number into a sum of the smallest possible 
number of cubes. This list revealed that after 23, the next number 
requiring 9 cubes is 239. These were the only numbers of the 
entire list that require 9 cubes, and the list extended to 12000.
The numbers requiring just 8 cubes were found to be 15, 22, 50,
114, 167, 175, 186, 212, 231, 238, 303, 364, 420, 428, 454, and
then no more were found all the way up to 12000. More numbers
were found to require 7 cubes, 7, 14, 21, 42, 47, 49, 61, 77, 85, 87,
103, ..., 5306, 5818, 8042, but even this series appears finally to
be coming to an end. Continuations of this empirical work have 
only added further confirmation. 
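
A list of this kind can be recomputed today in a fraction of a second. The Python sketch below is an illustration, not a description of how the original list was compiled; the name `min_cubes_table` is hypothetical. It finds the smallest number of cubes for every number up to 12000.

```python
def min_cubes_table(limit):
    """best[n] is the smallest number of positive cubes with sum n."""
    cubes = []
    k = 1
    while k ** 3 <= limit:
        cubes.append(k ** 3)
        k += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # Remove one cube and reuse the answer for the remainder.
        best[n] = 1 + min(best[n - c] for c in cubes if c <= n)
    return best


best = min_cubes_table(12000)
nine = [n for n in range(1, 12001) if best[n] == 9]
eight = [n for n in range(1, 12001) if best[n] == 8]
seven = [n for n in range(1, 12001) if best[n] == 7]
print("9 cubes:", nine)
print("8 cubes:", eight)
print("7 cubes:", seven[:11], "...", seven[-3:])
```

As the text stresses, a finite table like this only suggests what might be true beyond its range.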

Such empirical work can prove nothing. It only serves to suggest 
that it is probably true that every number is a sum of at most 9 
cubes, and that probably every number from a certain point on 
can be expressed as a sum of at most 8, or perhaps even 7, cubes. 
The latter statement, that 8 cubes suffice from some point on, was 
first proved by Landau by the use of difficult mathematical methods. 
After this was established, Wieferich proved the former statement. 

Fourth powers appear to behave in the same way as cubes. The 
first few fourth powers are 1, 16, 81, 256, .... Now 15 requires
15 fourth powers, 31 requires 16, 47 requires 17, 63 requires 18, 
and 79 requires 19. After this the new fourth power 81 intervenes and 
the picture is completely changed. The question would now be, do 
19 fourth powers always suffice? Much work has been done on 
this problem. Liouville proved that 53 suffice and this number 
was slowly forced down to 47, 45, 41, 39, 38, and then Wieferich 
obtained 37. However, these were all far from the hoped-for 19. 
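
The counts for 15, 31, 47, 63, and 79 can be confirmed in the same empirical way as for cubes. This Python sketch is illustrative only (the name `min_powers_table` is hypothetical); it handles any exponent and is applied here to fourth powers.

```python
def min_powers_table(limit, exponent):
    """best[n] is the smallest number of positive k-th powers with sum n,
    where k is the given exponent."""
    powers = []
    base = 1
    while base ** exponent <= limit:
        powers.append(base ** exponent)
        base += 1
    best = [0] * (limit + 1)
    for n in range(1, limit + 1):
        best[n] = 1 + min(best[n - p] for p in powers if p <= n)
    return best


best = min_powers_table(2000, 4)
for n in (15, 31, 47, 63, 79):
    # Below 81 only 1 and 16 are available, so n = 16q + r needs q + r terms.
    q, r = divmod(n, 16)
    print(n, best[n], q + r)
print("largest count up to 2000:", max(best[1:]))
```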

The great German mathematician Hilbert attacked the general 
problem in a different way. He did not try to improve the previous 
results, but considered instead the whole set of problems connected 
with cubes, fourth powers, etc. He was able to prove at one stroke 
that not only for cubes and fourth powers, but for fifth, sixth, and 
all higher powers, there is a number that will suffice (like the 9 and 
37 for cubes and fourth powers). Obviously this number is larger 
for higher powers. 

Hardy and Littlewood in England used still different and highly 
complex methods to attack the problem. The fact that they 
showed that all numbers from a certain point on are sums of 19 
fourth powers should give an idea of the power of their methods, 
and this is just one of their many far-reaching results. We have 
already seen that among the smaller numbers at least one requires 
19 fourth powers. Hardy and Littlewood’s result states that there
is some number N such that all numbers from N on are certainly
sums of at most 19 fourth powers, but this number N, as it is given
by their proof, is so enormous that they did not bother to actually
compute its value. In a sense this practically settles the case of
fourth powers, since all one would have to do would be to test
systematically all the numbers less than N to determine whether
or not 19 fourth powers will suffice for every number. However,
this number N is so tremendous that such a testing would far exceed
the ability of any computer.

We have used considerable space in discussing the facts connected 
with this problem. This discussion should serve to give an idea 
of how empirical study can be used to help discover facts and to 
develop new theorems. We shall now attempt to give some idea 
of the methods connected with this problem, especially those which 
were used by Hilbert. Unfortunately the powerful methods of 
Hardy and Littlewood are much too advanced and complex to be 
included. Even the proofs that we shall mention are partially 
beyond the scope of this discussion, but it is possible to exhibit the 
ideas that are used. 

As always, we start with a simple case. The equation
$(a + b)(a - b) = a^2 - b^2$ is familiar from algebra. If one forgets
it, it can easily be verified by actually multiplying out the left side.
This equation is true no matter what two numbers a and b may be.
An equation that is always true is called an “identity”. A somewhat
more complicated identity is:


(1) $(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2.$

In order to verify this we remember the formula

$(x + y)^2 = x^2 + 2xy + y^2$

and use it to multiply out the right hand side. We obtain


$(a^2c^2 + 2acbd + b^2d^2) + (a^2d^2 - 2adbc + b^2c^2)$
$= a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + 2abcd - 2abcd.$


The last two terms cancel each other and the rest can be grouped
to give $a^2(c^2 + d^2) + b^2(c^2 + d^2) = (a^2 + b^2)(c^2 + d^2)$, which is
just the left side of (1).

This identity yields a result of some interest: if each of two numbers 
is a sum of two squares then their product is also a sum of two 
squares. For example, 13 = 9+ 4 and 41 = 25 + 16 are both 
of this form. Then, according to (1), we find 


$533 = 13 \cdot 41 = (3^2 + 2^2)(5^2 + 4^2) = (3 \cdot 5 + 2 \cdot 4)^2 + (3 \cdot 4 - 2 \cdot 5)^2 = 23^2 + 2^2,$

and we have the product 533 expressed as a sum of two squares.
Formula (1) can be used in the same way for any two numbers
that are sums of two squares.
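
Identity (1) translates directly into a small rule for combining two representations. The Python sketch below uses the illustrative name `compose_two_squares` and repeats the example 13 · 41 = 533.

```python
def compose_two_squares(a, b, c, d):
    """Given a^2 + b^2 and c^2 + d^2, return (u, v) with
    u^2 + v^2 = (a^2 + b^2)(c^2 + d^2), following identity (1)."""
    return a * c + b * d, a * d - b * c


u, v = compose_two_squares(3, 2, 5, 4)
print(u, v, u * u + v * v)
```

The second number may come out negative for other inputs; this does no harm, since only its square is used.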

Euler, the great Swiss mathematician of the 18th century, discovered
the following identity:


(2) $(a_1^2 + a_2^2 + a_3^2 + a_4^2)(b_1^2 + b_2^2 + b_3^2 + b_4^2)$
    $= (-a_1b_1 + a_2b_2 + a_3b_3 + a_4b_4)^2 + (a_1b_2 + a_2b_1 + a_3b_4 - a_4b_3)^2$
    $+ (a_1b_3 - a_2b_4 + a_3b_1 + a_4b_2)^2 + (a_1b_4 + a_2b_3 - a_3b_2 + a_4b_1)^2.$


This identity can be verified without difficulty if both sides are
multiplied out, making use of the familiar formula


$(x_1 + x_2 + x_3 + x_4)^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + 2x_1x_2 + 2x_1x_3 + 2x_1x_4$
    $+ 2x_2x_3 + 2x_2x_4 + 2x_3x_4.$


Formula (2) is similar to (1) in that it shows that if two numbers 
are each sums of four squares, their product is also a sum of four 
squares. Lagrange used this identity in a very beautiful proof of 
the theorem of Fermat that every positive whole number can be 
expressed as a sum of at most four squares. In the first place this 
remark shows that it is only necessary to show that every prime 
number is a sum of four squares, since it will then automatically 
follow for composite numbers. To discuss the prime numbers, 
Lagrange again made use of (2). For the moment we will accept 
the theorem of Fermat as established, and will make use of it. 
Lagrange’s proof is given at the end of this chapter. 
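
Multiplying out identity (2) by hand is tedious, so a numerical spot check may be welcome. The Python sketch below is illustrative: the name `euler_four_numbers` is hypothetical, and the four expressions follow identity (2) with the signs as reconstructed from the substitution $b_1 = -y_1$ used at the end of the chapter. A check on random integers supports the identity but is of course not a proof.

```python
import random


def euler_four_numbers(a, b):
    """Return the four numbers whose squares appear on the right of (2)."""
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        -a1 * b1 + a2 * b2 + a3 * b3 + a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )


def sum_of_squares(values):
    return sum(v * v for v in values)


random.seed(0)
for _ in range(1000):
    a = [random.randint(-50, 50) for _ in range(4)]
    b = [random.randint(-50, 50) for _ in range(4)]
    assert sum_of_squares(euler_four_numbers(a, b)) == sum_of_squares(a) * sum_of_squares(b)
print(euler_four_numbers((1, 2, 3, 4), (5, 6, 7, 8)))
```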

This theorem was used by J. Liouville in proving that every
number is a sum of at most 53 fourth powers. He also made use
of an identity:


(3) $6(x_1^2 + x_2^2 + x_3^2 + x_4^2)^2$
    $= (x_1 + x_2)^4 + (x_1 + x_3)^4 + (x_2 + x_3)^4 + (x_1 + x_4)^4 + (x_2 + x_4)^4 + (x_3 + x_4)^4$
    $+ (x_1 - x_2)^4 + (x_1 - x_3)^4 + (x_2 - x_3)^4 + (x_1 - x_4)^4 + (x_2 - x_4)^4 + (x_3 - x_4)^4.$

In order to verify this we first use the binomial theorem to expand


$(x_1 + x_2)^4 = x_1^4 + 4x_1^3x_2 + 6x_1^2x_2^2 + 4x_1x_2^3 + x_2^4$

and

$(x_1 - x_2)^4 = x_1^4 - 4x_1^3x_2 + 6x_1^2x_2^2 - 4x_1x_2^3 + x_2^4.$


Adding these, we have

$(x_1 + x_2)^4 + (x_1 - x_2)^4 = 2x_1^4 + 2x_2^4 + 12x_1^2x_2^2.$

Using the corresponding formula for each parenthesis in the second
row of (3) plus the parenthesis below it, we find that the right side
of (3) has the expansion

$6(x_1^4 + x_2^4 + x_3^4 + x_4^4) + 12(x_1^2x_2^2 + x_1^2x_3^2 + x_2^2x_3^2 + x_1^2x_4^2 + x_2^2x_4^2 + x_3^2x_4^2).$

Now if the left side of (3) is also multiplied out it is immediately
seen to have the same value.
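
Identity (3) can be spot-checked numerically in the same spirit. In this illustrative Python sketch the names `liouville_left` and `liouville_right` are hypothetical; the right side pairs each sum with the corresponding difference, exactly as in the verification above.

```python
import random
from itertools import combinations


def liouville_left(x):
    """Left side of (3): six times the square of the sum of squares."""
    return 6 * sum(v * v for v in x) ** 2


def liouville_right(x):
    """Right side of (3): twelve fourth powers, one sum and one
    difference for each of the six pairs of variables."""
    return sum((u + v) ** 4 + (u - v) ** 4 for u, v in combinations(x, 2))


random.seed(1)
for _ in range(1000):
    x = [random.randint(-30, 30) for _ in range(4)]
    assert liouville_left(x) == liouville_right(x)
print(liouville_left((1, 2, 3, 4)), liouville_right((1, 2, 3, 4)))
```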

Liouville made use of this identity in the following way. Let n
be any positive whole number. It must be proved that n is a sum
of at most 53 fourth powers. He begins by dividing n by 6 and
finding the quotient and remainder, n = 6x + y (if n is 29, the
quotient x is 4, and the remainder y is 5). Here y will be one of the
numbers 0, 1, 2, 3, 4, 5. At this point Liouville makes use of Fermat’s
theorem for the first time. He uses it to show that x can
be written as a sum of four squares, $x = a^2 + b^2 + c^2 + d^2$. Then
the original number can be written

$n = 6x + y = 6(a^2 + b^2 + c^2 + d^2) + y = 6a^2 + 6b^2 + 6c^2 + 6d^2 + y.$

Now Liouville uses Fermat’s theorem again, applying it to a, b,
c, d to obtain

$a = a_1^2 + a_2^2 + a_3^2 + a_4^2,$
$b = b_1^2 + b_2^2 + b_3^2 + b_4^2,$
$c = c_1^2 + c_2^2 + c_3^2 + c_4^2,$
$d = d_1^2 + d_2^2 + d_3^2 + d_4^2,$

and therefore

$n = 6(a_1^2 + a_2^2 + a_3^2 + a_4^2)^2 + \dots + 6(d_1^2 + d_2^2 + d_3^2 + d_4^2)^2 + y.$

The identity (3) can now be used. Applied to the first expression
on the right, it asserts that this expression can be written as a sum
of 12 fourth powers. The same is true of each of the other three
similar expressions. Thus far 4 · 12 = 48 fourth powers have been
used and y must still be broken down into fourth powers. Since
y is 0, 1, 2, 3, 4, or 5, it can be expressed as a sum of at most 5 fourth
powers, each of which is 1. That gives a total of 48 + 5 = 53 fourth
powers.
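
Liouville's argument is constructive, so it can be followed step by step in code. The Python sketch below is an illustration under one stated assumption: the four squares are found by a plain search (`four_squares`, a hypothetical helper), since the text guarantees their existence but does not say how to find them. Fourth powers of 0 are dropped from the result.

```python
from itertools import combinations
from math import isqrt


def four_squares(n):
    """Search for (a, b, c, d), a >= b >= c >= d >= 0, with
    a^2 + b^2 + c^2 + d^2 = n."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            rest = n - a * a - b * b
            for c in range(min(b, isqrt(rest)), -1, -1):
                d = isqrt(rest - c * c)
                if d * d == rest - c * c and d <= c:
                    return a, b, c, d
    raise AssertionError("no representation found")


def liouville_fourth_powers(n):
    """Return at most 53 positive whole numbers whose fourth powers add up to n."""
    x, y = divmod(n, 6)                 # n = 6x + y with y = 0, ..., 5
    bases = [1] * y                     # y is a sum of y fourth powers of 1
    for value in four_squares(x):       # x = a^2 + b^2 + c^2 + d^2
        parts = four_squares(value)     # a = a1^2 + a2^2 + a3^2 + a4^2
        for u, v in combinations(parts, 2):
            bases += [u + v, abs(u - v)]  # identity (3) applied to 6a^2
    return [base for base in bases if base > 0]


bases = liouville_fourth_powers(29)
print(bases, sum(base ** 4 for base in bases))
```

The representation found this way is usually far from the shortest one; the point of the argument is only that 53 terms always suffice.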


Lagrange’s Theorem states that every positive whole number N
can be written as the sum of the squares of four whole numbers,
$N = x_1^2 + x_2^2 + x_3^2 + x_4^2$. In proving this theorem, we shall make
use of formula (2) of this chapter but will change the letters. If
we take $a_1 = x_1$, $a_2 = x_2$, $a_3 = x_3$, $a_4 = x_4$, $b_1 = -y_1$, $b_2 = y_2$,
$b_3 = y_3$, $b_4 = y_4$, we find

(2’) $(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2)$
     $= (x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4)^2 + (x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3)^2$
     $+ (x_1y_3 - x_3y_1 + x_4y_2 - x_2y_4)^2 + (x_1y_4 - x_4y_1 + x_2y_3 - x_3y_2)^2.$


The proof can be broken up into a number of steps. We first 
prove: 

Theorem 1. If A and B can each be written as the sum of four squares,
then so can the product AB. This follows directly from (2’), since if
$A = x_1^2 + x_2^2 + x_3^2 + x_4^2$ and $B = y_1^2 + y_2^2 + y_3^2 + y_4^2$, then (2’) shows
that AB is a sum of the squares of four numbers, each of which is
clearly a whole number. This result will allow us to concentrate
our attention on the prime numbers, since it shows that we need
consider only the factors of a number. We shall not try to prove
the whole result for prime numbers at once, but shall start with:

Theorem 2. If p is a prime number greater than 2, then it is possible
to find a whole number m such that 1 ≤ m < p and such that mp can be
written as a sum of four squares, $mp = x_1^2 + x_2^2 + x_3^2 + x_4^2$. This would
be easy to prove if we did not insist on 1 ≤ m < p, since we could
write $0 \cdot p = 0^2 + 0^2 + 0^2 + 0^2$ or $p \cdot p = p^2 + 0^2 + 0^2 + 0^2$.
However, it will be important later to have 1 ≤ m < p.

To prove theorem 2, we first notice that p is an odd number, since
it is prime and greater than 2. We write down the numbers
$0^2, 1^2, 2^2, \dots, \left(\frac{p-1}{2}\right)^2$, divide each by p, and keep only the
remainders. This gives $\frac{p+1}{2}$ numbers r, each between 0 and
p − 1. For p = 11 we would write down 0, 1, 4, 9, 16, 25.
Dividing each by 11, we find the remainders r to be 0, 1, 4, 9, 5, 3.
In every case all of these remainders will be different. If two
were equal we would have two whole numbers, $x_1 > x_2$, between 0
and $\frac{p-1}{2}$, whose squares would yield the same remainder r
when divided by p. That is, we would have $x_1^2 = q_1p + r$ and
$x_2^2 = q_2p + r$. Subtracting, we find $x_1^2 - x_2^2 = (q_1 - q_2)p$ or
$(x_1 - x_2)(x_1 + x_2) = (q_1 - q_2)p$. Since p is a prime, it must divide
either $x_1 - x_2$ or $x_1 + x_2$, but this is impossible, since $x_1 - x_2$ and
$x_1 + x_2$ are positive numbers less than p.

Now we take the remainders r, increase each by 1 and subtract
from p. This again gives us $\frac{p+1}{2}$ numbers s between 0 and
p − 1, all of which are different. For p = 11 we would obtain
10, 9, 6, 1, 5, 7. At least one of the numbers s must equal a number
of our previous set r, since r uses up $\frac{p+1}{2}$ of the p numbers
0, 1, 2, ..., p − 1, leaving over only $\frac{p-1}{2}$ numbers, while there
are $\frac{p+1}{2}$ numbers s. For p = 11 we find the numbers 1, 9, 5
in both r and s.


Let R = S be equal numbers from r and s. R is the remainder
on dividing some $x^2$, $0 \le x \le \frac{p-1}{2}$, by p. S is obtained by
dividing some number $y^2$, $0 \le y \le \frac{p-1}{2}$, by p, increasing the remainder
by 1 and subtracting from p. That is, $x^2 = q_1p + R$, $y^2 = q_2p + r$,
$S = p - (r + 1)$. Adding these three equations, we have
$x^2 + y^2 + S = (q_1 + q_2 + 1)p + R - 1$. Since R = S, we can
write this as $x^2 + y^2 + 1 = mp$ where $m = q_1 + q_2 + 1$. Also,
since $0 \le x \le \frac{p-1}{2}$ and $0 \le y \le \frac{p-1}{2}$, we have

$0 < mp \le \left(\frac{p-1}{2}\right)^2 + \left(\frac{p-1}{2}\right)^2 + 1 = \frac{(p-1)^2}{2} + 1 < p^2,$

and therefore 0 < m < p. This proves theorem 2, since we
have 1 ≤ m < p and $mp = x^2 + y^2 + 1^2 + 0^2$.

For p = 11 we could take R = S = 5, finding x = 4, y = 4,
$4^2 + 4^2 + 1^2 + 0^2 = 3 \cdot 11$. However, if we take R = S = 1, we
find x = 1, y = 3, $1^2 + 3^2 + 1^2 + 0^2 = 1 \cdot 11$. In this case we
have been able to write p = 11 as a sum of four squares. The
method we have used may not always give us p as a sum of four
squares, however, so we now prove:
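
The construction in the proof of theorem 2 can be carried out for any odd prime. The following Python sketch is illustrative (the name `theorem2_solutions` is hypothetical): it forms the two sets of numbers r and s and returns, for every number they have in common, the values x, y, and m with x² + y² + 1 = mp.

```python
def theorem2_solutions(p):
    """For an odd prime p, return a list of (x, y, m) with
    x^2 + y^2 + 1 = m * p and 0 <= x, y <= (p - 1) / 2."""
    half = (p - 1) // 2
    # r: remainder of x^2 on division by p, remembered together with x.
    r = {x * x % p: x for x in range(half + 1)}
    # s: p minus (remainder of y^2 plus 1), remembered together with y.
    s = {p - (y * y % p + 1): y for y in range(half + 1)}
    solutions = []
    for common in sorted(r.keys() & s.keys()):
        x, y = r[common], s[common]
        solutions.append((x, y, (x * x + y * y + 1) // p))
    return solutions


print(theorem2_solutions(11))
```

For p = 11 the common values are 1, 5, and 9, and the value 5 gives the case x = 4, y = 4, m = 3 mentioned above.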


Theorem 3. If p is a prime number greater than 2 and if m is the smallest
positive whole number such that mp is a sum of four squares, then m = 1.
We already have m < p from theorem 2. This smallest m cannot be even,
for if it were we would have $x_1^2 + x_2^2 + x_3^2 + x_4^2 = mp$, an even number.
Then either all the x’s would be even, two would be even and two
odd, or all would be odd. In the second possibility, we can suppose
that we have numbered the x’s so that $x_1$ and $x_2$ are even, while $x_3$
and $x_4$ are odd. Then in every case $(x_1 + x_2)$, $(x_1 - x_2)$, $(x_3 + x_4)$,
and $(x_3 - x_4)$ are all even numbers. We then have

$\left(\frac{x_1 + x_2}{2}\right)^2 + \left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_3 + x_4}{2}\right)^2 + \left(\frac{x_3 - x_4}{2}\right)^2$
$= \frac{1}{2}(x_1^2 + x_2^2 + x_3^2 + x_4^2) = \frac{m}{2}\,p,$

and all the numbers whose squares are taken on the left are whole
numbers. That is, $\frac{m}{2}\,p$ can be written as a sum of four squares,
and the even m was not the smallest possible value, as we had
supposed.

We now know that the smallest m is odd and m < p. To prove
m = 1, we suppose m > 1 and again show that it could be reduced.
Since m is odd, we may suppose m ≥ 3. Then we have


(4) $mp = x_1^2 + x_2^2 + x_3^2 + x_4^2.$

We divide each $x_k$ by m and obtain a remainder $r_k$, $0 \le r_k < m$.
If $0 \le r_k \le \frac{m-1}{2}$, we write $y_k = r_k$. If $\frac{m+1}{2} \le r_k \le m - 1$,
we write $y_k = r_k - m$. In both cases we have $x_k = q_km + y_k$ and
$-\frac{m-1}{2} \le y_k \le \frac{m-1}{2}$. Then, using (4),

(5) $y_1^2 + y_2^2 + y_3^2 + y_4^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 - 2m(x_1q_1 + x_2q_2 + x_3q_3 + x_4q_4)$
    $+ m^2(q_1^2 + q_2^2 + q_3^2 + q_4^2) = mp - 2m(x_1q_1 + x_2q_2 + x_3q_3 + x_4q_4)$
    $+ m^2(q_1^2 + q_2^2 + q_3^2 + q_4^2) = mn,$


where n is a whole number. Furthermore, we have n > 0, since
if n = 0 we would have $y_1 = y_2 = y_3 = y_4 = 0$. This would mean
that each x is a multiple of m, and hence each $x^2$ is a multiple of $m^2$.
From (4) we would then see that mp is a multiple of $m^2$, so p is
a multiple of m. But this cannot be, since p is prime and 1 < m < p.

We also have $mn = y_1^2 + y_2^2 + y_3^2 + y_4^2 \le 4\left(\frac{m-1}{2}\right)^2 < m^2$, and
hence n < m.


Multiplying (4) and (5), we find


(6) $m^2np = (x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2)$
    = the right side of (2’).


Since $y_k = x_k - q_km$, the first number whose square appears on
the right side of (2’) is

$x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4$
$= x_1(x_1 - q_1m) + x_2(x_2 - q_2m) + x_3(x_3 - q_3m) + x_4(x_4 - q_4m)$
$= x_1^2 + x_2^2 + x_3^2 + x_4^2 - m(x_1q_1 + x_2q_2 + x_3q_3 + x_4q_4)$
$= mp - m(x_1q_1 + x_2q_2 + x_3q_3 + x_4q_4)$
$= mz_1,$


where $z_1$ is a whole number. The second number whose square
appears on the right side of (2’) is


$x_1y_2 - x_2y_1 + x_3y_4 - x_4y_3$
$= x_1(x_2 - q_2m) - x_2(x_1 - q_1m) + x_3(x_4 - q_4m) - x_4(x_3 - q_3m)$
$= m(-x_1q_2 + x_2q_1 - x_3q_4 + x_4q_3)$
$= mz_2,$


where $z_2$ is also a whole number. In a similar way, we find that
the third and fourth numbers are $mz_3$ and $mz_4$. Putting these in
(6), we have


$m^2np = m^2z_1^2 + m^2z_2^2 + m^2z_3^2 + m^2z_4^2,$
$np = z_1^2 + z_2^2 + z_3^2 + z_4^2.$


Therefore np can be written as a sum of four squares, and we have
already found 0 < n < m. This shows that m > 1 was not the
smallest possible value, as we had supposed. All that is left is
m = 1, and theorem 3 is proved.

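For the case p = 11, $x_1 = 4$, $x_2 = 4$, $x_3 = 1$, $x_4 = 0$, m = 3,
we have $y_1 = 1$, $y_2 = 1$, $y_3 = 1$, $y_4 = 0$;
$4 \cdot 1 + 4 \cdot 1 + 1 \cdot 1 + 0 \cdot 0 = 3z_1$, $z_1 = 3$;
$4 \cdot 1 - 4 \cdot 1 + 1 \cdot 0 - 0 \cdot 1 = 3z_2$, $z_2 = 0$;
$4 \cdot 1 - 1 \cdot 1 + 0 \cdot 1 - 4 \cdot 0 = 3z_3$, $z_3 = 1$;
$4 \cdot 0 - 0 \cdot 1 + 4 \cdot 1 - 1 \cdot 1 = 3z_4$, $z_4 = 1$;
and finally $3^2 + 0^2 + 1^2 + 1^2 = 11$.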
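
One step of the reduction used in the proof of theorem 3 can be written out as a short routine. The Python sketch below is an illustration with hypothetical names (`centered_remainder`, `descent_step`); it assumes m is odd and at least 3, as in the second half of the proof, and reproduces the example for p = 11.

```python
def centered_remainder(x, m):
    """Return y with x = q*m + y and -(m - 1)/2 <= y <= (m - 1)/2, for odd m."""
    r = x % m
    return r if r <= (m - 1) // 2 else r - m


def descent_step(xs, m, p):
    """Given x1^2 + x2^2 + x3^2 + x4^2 = m*p with odd m >= 3, return
    ((z1, z2, z3, z4), n) with z1^2 + z2^2 + z3^2 + z4^2 = n*p and 0 < n < m."""
    x1, x2, x3, x4 = xs
    y1, y2, y3, y4 = (centered_remainder(x, m) for x in xs)
    n = (y1 * y1 + y2 * y2 + y3 * y3 + y4 * y4) // m
    # The four numbers on the right side of (2'); each is a multiple of m.
    numbers = (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4,
        x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2,
    )
    return tuple(value // m for value in numbers), n


zs, n = descent_step((4, 4, 1, 0), 3, 11)
print(zs, n)
```

Some of the numbers z may be negative for other inputs, which is harmless because only their squares enter the result.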

Theorem 4. If p is a prime number it can be written as a sum of four
squares. This is hardly more than a restatement of theorem 3.
This is true if p = 2, since we have $2 = 1^2 + 1^2 + 0^2 + 0^2$. If
p > 2 it is true by theorem 3.

We finally come to Lagrange’s Theorem: 

Theorem 5. Every whole number n ≥ 0 can be written as a sum of four
squares. This is true for n = 0 and n = 1, since we have
$0 = 0^2 + 0^2 + 0^2 + 0^2$ and $1 = 1^2 + 0^2 + 0^2 + 0^2$. It is also
true if n equals a prime number by theorem 4. All that are left
are the composite numbers. If n is composite, we can factor it
into a product of primes, $n = p_1p_2p_3 \cdots p_k$, where $p_1, p_2, p_3, \dots, p_k$
are prime numbers, not necessarily distinct. Now, by theorem 4,
$p_1$ and $p_2$ can be written as sums of four squares. Then by theorem
1, the product $p_1p_2$ can also be written as the sum of four squares.
Again by theorem 4, $p_3$ can be written as a sum of four squares and,
by theorem 1, so can the product $p_1p_2p_3$. Continuing in this manner,
we finally find that n can be written as a sum of four squares.
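
The way theorem 5 is assembled from theorems 1 and 4 can be imitated in code. The Python sketch below is an illustration only: all function names are hypothetical, and the representation of each prime is found by a plain search rather than by the reduction of theorem 3, which is an assumption made to keep the example short.

```python
from math import isqrt


def prime_factors(n):
    """Prime factors of n >= 2 with repetition, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def prime_as_four_squares(p):
    """Search for four whole numbers whose squares add up to p."""
    bound = isqrt(p)
    for a in range(bound + 1):
        for b in range(a, bound + 1):
            for c in range(b, bound + 1):
                rest = p - a * a - b * b - c * c
                if rest < 0:
                    break
                d = isqrt(rest)
                if d * d == rest:
                    return a, b, c, d
    raise AssertionError("no representation found")


def compose(x, y):
    """Theorem 1: combine two sums of four squares by means of (2')."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    return (
        x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4,
        x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3,
        x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4,
        x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2,
    )


def four_squares(n):
    """Theorem 5: write n >= 0 as a sum of four squares."""
    if n == 0:
        return (0, 0, 0, 0)
    result = (1, 0, 0, 0)  # 1 = 1^2 + 0^2 + 0^2 + 0^2
    for p in prime_factors(n):
        result = compose(result, prime_as_four_squares(p))
    return tuple(abs(v) for v in result)


print(four_squares(2024))
```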


10. On Closed Self-Intersecting Curves 


1. The curves that we shall discuss in this chapter are of a special 
kind. Although they may be quite complicated, they must satisfy 
certain conditions. First, they must be traversible in a single 
passage. That is, one should be able to draw the whole curve 
with a single stroke, starting at a given point and never taking the 
pencil from the paper until the curve is completely drawn. Second, 
they must be closed. That is, when one draws the curve he should 
be able to start at a given point, trace out the whole curve, and return 
to the original starting point just as the curve is completed. Finally, 
the curves may cross themselves any number of times, but when 
they do, they shall pass through such a crossing point only twice. 
The examples shown in Figs. 26 and 27 satisfy these conditions;
that in Fig. 28 does not. The type of crossing point that is allowed
(Figs. 26 and 27) is called a double point. The crossing of Fig. 28
is not allowed and is called a triple point.¹ In tracing out the
complete curve, one clearly passes through each double point
twice. If we designate each double point by a number then we
can show the order in which we pass through them by means of a
series of numbers. For example, Fig. 26 has the order 1 2 2 1,
while Fig. 27 has 1 2 3 1 2 3. Since each double point is
passed twice, each number must appear twice in the series. Gauss
noticed that it is not true that just any series in which each number
appears twice will represent the order of double points on a curve.
In the case of two double points, we have encountered the ordering
1 2 2 1, but there is no curve with the order 1 2 1 2.
This can be verified by trial and error for this very simple case.

[Figs. 26, 27, and 28]

1 Although the material presented here has many connections with the second
chapter, the fundamental ideas are essentially different, and the reader will do
well to consider the two chapters independently. In the second chapter a network
of curves was given and ways to traverse it were discussed. Here the given curve
is a “closed route” and there is no question of how it can be traversed. Also,
in the words of the second chapter, the present curves have junctions of only
the fourth order, now called double points. Finally, at a double point there is
no choice as to how the parts of the curve can be connected; the four parts coming
together at a double point must be connected in two pairs of opposite parts,
since the curve is supposed to cross itself there.

The principal result of this section will be the theorem that in
the sequence each double point appears once in an even place, once in an
odd place. Expressed a little differently, this asserts that the two
places where the double point appears are separated either by an
even number of places or by none at all. This theorem immediately
shows that 1 2 1 2 is impossible since there is just one place
between the two 1’s.
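
The condition stated in the theorem is simple to test for any proposed sequence. The Python sketch below uses the illustrative name `satisfies_parity_rule`. It checks only the condition of the theorem, which every curve of the kind considered here must satisfy; the text does not claim that every sequence passing the test belongs to a curve.

```python
def satisfies_parity_rule(sequence):
    """True if every label occurs exactly twice in the sequence,
    once in an odd place and once in an even place."""
    places = {}
    for place, label in enumerate(sequence, start=1):
        places.setdefault(label, []).append(place)
    return all(
        len(found) == 2 and (found[0] + found[1]) % 2 == 1
        for found in places.values()
    )


for sequence in ([1, 2, 2, 1], [1, 2, 3, 1, 2, 3], [1, 2, 1, 2]):
    print(sequence, satisfies_parity_rule(sequence))
```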

2. In order to prove the theorem, we consider an arbitrary 
double point Q of the curve A (Fig. 29). If we start from Q and 
follow the curve A, we will eventually return to Q. When we first 
return to Q we will have traversed a part B of the whole curve A. 


This part can be only a part of the whole curve, since there are four 
segments of the curve radiating from Q and we have traversed only 
two, one when leaving Q and one when returning. If we continue 
from Q along the curve A we traverse the remainder, C, of A. Both 
B and C are closed curves, each with a sharp corner at Q. Although 
Q is a double point of A it is not a double point of B or C (neither 
curve crosses itself there), and B and C touch but do not cross each 
other at Q. We must prove that on traversing B from Q back
to Q we pass through double points an even number of times.²
The double points of A that are on B are, besides Q, the points
where B crosses itself (double points of B) and the points where B
and C intersect.

In traversing B one certainly passes through each double point
of B twice. Therefore all these double points together contribute
an even number to our count. At each intersection of B and C
two paths cross, one a part of B, the other a part of C. In traversing
B one never enters the path belonging to C, so one passes through
each intersection of B and C only once. Now we need only show
that B and C intersect in an even number of points.

3. Without changing the intersections of B and C we can slightly
deform the curves near Q in such a way as to break them apart
(Fig. 30). The two curves now may cross each other at various
points, but nowhere do they touch each other without crossing.

[Fig. 30]

Now we have to show that two such curves either do not intersect,
or intersect in an even number of points. In order to show
this we shall deform one of the curves, say C, a step at a time. At
each step we will remove a double point from C and thereby make
the situation less complex. As at the point Q, we shall always
deform the curve so little that we won’t disturb the intersections
of B and C.

Let P be a double point of C. The curve C splits into two closed 
curves D and E at P in the same way that A split into B and C at Q.
The curves D and E touch each other at P. If we mark a certain
direction on C, the direction along which we traverse it, then D
and E are automatically given a direction (Fig. 31a). Now, starting
at P, we traverse D in the marked direction, return to P, and then
traverse E back to P in a direction opposite to that marked. By using
the opposite direction on E we have eliminated the crossing at P.


2 An even number of times, not of double points. In Fig. 26, with the order 
1 2 2 1, we pass through the single point 2 between 1 and 1, but we pass through 
it twice. 
We have traversed C in a single passage that went through P twice,
along two paths which have corners at P but which do not cross 


Fig. 31a Fig. 31b Fig. 31c


there. These two paths merely touch each other at P so we can 
pull them slightly apart and round off the corners. We have then 
replaced C by a curve that does not have a double point at P (Fig. 
31b). The new curve has one less double point than the original
curve C. When deforming the curve near P, we must do it so
slightly that no other double point of C and no intersection of B
and C are disturbed.

We repeat this whole procedure for each double point of C and 
finally arrive at a curve C* that is free of double points (Fig. 31c).
The two curves C and C* are nearly the same. They differ only 
near the double points of C. Of importance to us is the fact that 
both curves intersect B in the same points. 

4. A closed curve that is free of double points encloses a region 
that we call the “interior” of the curve. This fact appears rather 
obvious, and we shall accept it on the basis of intuition. The part 
of the plane that is neither on the curve nor the interior is the 
“exterior,” and it is separated from the interior by the curve itself. 
If such a curve is cut by another curve at some point, then the 
second curve must pass, at that point, from the interior to the 
exterior, or vice versa. 

We can now show that the curves B and C either do not intersect, 
or intersect in an even number of points. Without altering the 
intersections, we can replace C by C*, which has no double points. 
It may happen that B is entirely in the interior of C*, in which case 
there are no intersections. If B lies entirely in the exterior of C*,
then again there are no intersections. Finally, B may lie partly 
in the interior of C*, partly in the exterior. In that case let T be 
any point on B in the exterior of C*. If we start at T and traverse B, 
there must be a first time that we enter the interior of C*. This 
is one point of intersection of B and C*. Since B is closed, we must 
finally arrive back at T in the exterior of C*; that is, we must go
from the interior to the exterior, a second intersection. It may
happen that B enters the interior several times, but each time B 
enters the interior must be followed by a time that B leaves the 
interior, since B finally returns to the starting point T in the exterior. 
In all cases therefore, C* and B (and hence C and B) either inter- 
sect in an even number of points or they do not intersect at all. 
As we saw at the end of § 2, this is enough to complete the proof 
of the theorem.

5. This theorem, which asserts that each double point of A 
occurs once at an even place and once at an odd place, can be 
expressed in a somewhat different way. We will think of A as the 
projection of a curve that is drawn in space, with the double points 
of A representing places where one part of the curve passes over or 
under another part of the curve. If the curve were a road the 
double points would represent underpasses. Now we would like 
to arrange the curve so that on traversing it we alternately take the 
upper and lower roads of the underpasses. If we start on the upper 
road of some particular underpass and traverse the curve, then we 
should go under the first underpass, over the next, etc. Thus the 
whole arrangement is completely determined. The only question 
is whether we can completely arrange it that way. For at some 
time we shall return to an underpass (a double point of A) that we 
have already crossed. Might it happen that we had originally 
crossed, say, over at this underpass and that we should again cross 
over in order to alternate? No, for our theorem states exactly that 
such a contradiction cannot occur. According to it we must have 
gone through double points of A, underpasses, an even number of 
times before returning to the original underpass. Since we started 
over the original underpass we went under the first, over the 
second, ..., over the last (because of the even number of times).
Therefore on returning to the original underpass we should go under, 
and this is just the way which is left open. 
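
The statement that each double point occurs once at an even place and once at an odd place can be tested mechanically. The following is only a minimal sketch: the function name is our own, and apart from the order 1 2 2 1 quoted for Fig. 26, the sequences are illustrative assumptions rather than figures from the text.

```python
def alternates_in_parity(order):
    """Return True if every double point is met once at an odd and once at an even place.

    `order` lists the double points in the order in which one full traversal
    of the closed curve passes through them, so each label appears twice.
    """
    places = {}
    for place, point in enumerate(order, start=1):  # places are counted from 1
        places.setdefault(point, []).append(place)
    for point, seen in places.items():
        if len(seen) != 2:
            raise ValueError(f"double point {point!r} must be passed exactly twice")
    # One odd place and one even place means the two places have an odd sum.
    return all((first + second) % 2 == 1 for first, second in places.values())


print(alternates_in_parity([1, 2, 2, 1]))        # order quoted for Fig. 26 -> True
print(alternates_in_parity([1, 2, 3, 1, 2, 3]))  # assumed three-crossing curve -> True
print(alternates_in_parity([1, 2, 1, 2]))        # violates the theorem -> False
```

By the theorem, a sequence such as 1 2 1 2 cannot arise from any closed curve in the plane, which is why the last call reports False.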

Figs. 32 and 33 show the curves of Figs. 26 and 27 drawn in this 
way as projections of curves in space. At each double point they 
show which part of the curve passes over the other part. These 
space curves are “knots.” A knot of this kind, in which we alter- 
nately go over and under when traversing its projection, is called 
an “alternating” knot. Strictly speaking, Fig. 32 is not a proper 
knot, for a loop of string twisted in that shape could be pulled out 
into an unknotted circle. However, Fig. 33 represents a proper 
knot. It cannot be changed to a circle without cutting it. 




FACTORIZATION OF A NUMBER INTO PRIME FACTORS 


Fig. 34 shows that not all knots are alternating knots, at least 
without being deformed in some manner. The fact that there are 


Fig. 32 Fig. 33 Fig. 34 


non-alternating knots perhaps most clearly shows that our theorem, 
in either of its formulations, is certainly not trivial. 


11. Is the Factorization of a Number into 
Prime Factors Unique? 


1. Starting with any given number, one can keep splitting it 
into factors until one finally has only prime factors. For example, 
60 may be factored as 6 · 10, 6 as 2 · 3, and 10 as 2 · 5, so that we
finally have

60 = 2 · 3 · 2 · 5


and these four factors are all prime. 
Still using the example 60, we could first have factored it as 
60 = 4 · 15, 4 = 2 · 2, 15 = 3 · 5, from which we have


60 = 2 · 2 · 3 · 5.


The same primes appear in both these factorings and each prime 
appears the same number of times. Writing the primes in order 
of size, we have 

60 = 2² · 3 · 5



in both cases. The fact that we get the same result in both cases 
seems very obvious because we are so used to it. In arithmetic 
we have learned to take it for granted that if we factor a number as 
far as possible into prime factors, we will always obtain the same 
factors no matter how we factor it. 

That statement is true, but is it really so obvious? Consider a large 
number. It would take considerable work to factor 30031. After 
many trials we might discover that it can be factored into 59 · 509,




and that these factors are prime. Now on what basis can we honestly 
say that it is obvious that further trials will not reveal some other 
factorization which is entirely different? 
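
The "many trials" can be delegated to a short program. This is a minimal sketch of trial division, with a function name of our own choosing. It produces one factorization into primes; it does not show that no other factorization exists, which is exactly the question raised here.

```python
def prime_factors(n):
    """Factor a whole number n > 1 into primes by trial division, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:   # divide out d as often as it goes into n
            factors.append(d)
            n //= d
        d += 1
    if n > 1:               # whatever remains has no divisor up to its square root
        factors.append(n)
    return factors


print(prime_factors(60))     # [2, 2, 3, 5]
print(prime_factors(30031))  # [59, 509]
```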

Such a question goes counter to all the ideas we have learned 
to accept concerning the prime numbers that “go into” a number. 
The aim of § 2 and § 3 will be to show that these intuitive ideas 
have no real basis. Then, having shown that there really is a 
problem, we shall devote the remainder of the chapter to proving 
that the factorization into prime factors is indeed unique. 

2. In order to free ourselves of preconceived ideas we shall con- 
sider an unfamiliar system of numbers. These are the numbers of 
the form a + b√6, where a and b are ordinary whole numbers.
For example, 12 + 5√6, √6 − 2, and 3√6 are such numbers,
while 2 + 3√12 is not. The ordinary whole numbers are not
excluded; in fact they merely have b = 0. They form a part of
our number system, so this new system is an extension of the set of 
ordinary whole numbers. 

Computation with these numbers is carried on just as one would 
naturally expect. The processes are all familiar from algebra. The 
way to add or subtract two of these numbers is made clear by the 
example 


(3 + √6) + (5 + √6) = 8 + 2√6.


The following multiplications, carried out by the usual rules of 
algebra, should show how to multiply any two numbers: 


(3 + √6)(3 − √6) = 9 − 6 = 3

(√6 + 2)(√6 − 2) = 6 − 4 = 2

(3 + √6)(√6 − 2) = 3√6 − 6 + 6 − 2√6 = √6

(3 − √6)(√6 + 2) = √6

(3 + √6)(2 + √6) = 12 + 5√6

(3 − √6)(√6 − 2) = −12 + 5√6


We do not need to say anything about division. Sometimes one 
number will divide another, sometimes not, just as in the case of 
ordinary whole numbers. 
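
These products can be checked mechanically. In the sketch below a number a + b√6 is stored as the pair (a, b); this representation and the name `multiply` are our own conventions, not notation from the text.

```python
def multiply(x, y, d=6):
    """Multiply a + b*sqrt(d) by c + e*sqrt(d), each given as a pair of whole numbers."""
    a, b = x
    c, e = y
    # (a + b√d)(c + e√d) = (ac + d·be) + (ae + bc)√d
    return (a * c + d * b * e, a * e + b * c)


print(multiply((3, 1), (3, -1)))   # (3 + √6)(3 − √6)  -> (3, 0)
print(multiply((2, 1), (-2, 1)))   # (√6 + 2)(√6 − 2)  -> (2, 0)
print(multiply((3, 1), (-2, 1)))   # (3 + √6)(√6 − 2)  -> (0, 1)
print(multiply((3, -1), (2, 1)))   # (3 − √6)(√6 + 2)  -> (0, 1)
print(multiply((3, 1), (2, 1)))    # (3 + √6)(2 + √6)  -> (12, 5)
print(multiply((3, -1), (-2, 1)))  # (3 − √6)(√6 − 2)  -> (-12, 5)
```

The parameter `d` is left adjustable so that the same sketch can be tried with other radicands, for instance `d=-6`.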


In our number system, 6 can be factored into √6 · √6 as well
as in the usual way,


(1) 6 = 2 · 3 = √6 · √6.

Here, apparently, 6 can be factored in two different ways. This
leads us to remember the question as to whether 30031 has a
factorization that differs from 59 · 509. Apparently (1) presents an
analogous situation. 

However, this case can be cleared up in a very natural way. 
The numbers 2 and 3 are ordinary primes. They cannot be 
factored in the ordinary number system, but they can be factored 
in the new system. In fact, from our multiplication examples 
we have 


2 = (√6 + 2)(√6 − 2),    3 = (3 + √6)(3 − √6).
Therefore, continuing with our factorization of 6 = 2 · 3, we have
(2) 6 = (√6 + 2)(√6 − 2)(3 + √6)(3 − √6).
The two factorizations (1) are merely (2) with different pairs of 
factors combined. The first of these has the first two factors and 
the last two combined. The second factorization has the first and 
fourth factors combined as well as the second and third. Obviously 


there are other possible combinations. For example, if we combine 
the first and third factors and the second and fourth factors, we have 


6 = (12 + 5√6)(−12 + 5√6),


and this factorization can be verified by direct multiplication. 

This case is not different from what we are accustomed to. In 
clearing it up we did not have to know that the factors in (2) are 
prime. By a prime number, we now obviously mean a number 
that cannot be factored in our system. Actually it would not be 
difficult to show that the four factors are prime. 

3. Now we turn to still another number system. This system is 
the set of numbers of the form a + b√−6, where a and b again
are ordinary whole numbers. In this case we shall again find a
situation like (1), but this time we will not be able to explain it
away as we did with (2). Computations can be made in this system
just as easily as in the system a + b√6. The rules of algebra are
used exactly as they were before. 

Corresponding to (1) we now have 


(3) 6 = 2 · 3 = −√−6 · √−6.


In analogy with the other case, we shall attempt to factor 2, 3, and 


√−6. This time however it will turn out that they are primes
in this system, that we cannot factor them.

In our discussion it will be convenient to use the idea of the “norm”
of a number. The norm of the number a + b√−6 is the product
of that number with a − b√−6,


N(a + b√−6) = (a + b√−6)(a − b√−6) = a² + 6b².
In other words, in our system, the norm of a number is the product 


of that number and the number that is obtained by replacing 


√−6 by −√−6. The norm of a number is always a positive,
ordinary whole number. Also the norm of the product of two 
numbers is equal to the product of their norms. For, according 
to the rule, we have 


N[(a + b√−6)(c + d√−6)]
= [(a + b√−6)(c + d√−6)] [(a − b√−6)(c − d√−6)],
and these four factors can be reordered to give
(a + b√−6)(a − b√−6)(c + d√−6)(c − d√−6).
Pairing the terms, this is exactly
N(a + b√−6) N(c + d√−6),


according to the rule. 
If 2 could be factored into two factors in our system, we would 
have 
2 = (a + b√−6)(c + d√−6),


and therefore
N(2) = N(a + b√−6) N(c + d√−6).


But the norm of 2 is N(2) = (2 + 0√−6)(2 − 0√−6) = 2 · 2 = 4,
so we would have


4 = (a² + 6b²)(c² + 6d²).


That is, 4 would be factored into a product of two ordinary whole 
numbers, each of the form x² + 6y². There are only two ways in
which 4 can be factored using ordinary numbers. Either both
factors are 2, or one is 4 and the other 1. Neither helps us here,
since 2 cannot be expressed in the form x² + 6y² and 1 is of this
form only with x = 1, y = 0. Therefore the only way that 2 can
be split into two factors in the system is for one factor to be
1 + 0√−6 = 1. We don’t consider this as a factorization in
this system any more than we would consider 5 = 1 · 5 as a
factorization in the ordinary number system.


In exactly the same way one can recognize that 3 and √−6
are primes in the system. Instead of the norm 4, the norms 9 and
6 would have to be split into factors of the form x² + 6y².
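
The norm argument comes down to asking which whole numbers have the form x² + 6y². The following minimal sketch searches for such representations; the function names are our own, and only non-negative x and y are listed, since signs do not change the norm.

```python
from math import isqrt


def norm(a, b):
    """Norm of a + b*sqrt(-6), namely a^2 + 6b^2."""
    return a * a + 6 * b * b


def representations(n):
    """All pairs (x, y) with x, y >= 0 and x^2 + 6y^2 == n."""
    return [(x, y)
            for x in range(isqrt(n) + 1)        # x^2 cannot exceed n
            for y in range(isqrt(n // 6) + 1)   # 6y^2 cannot exceed n
            if norm(x, y) == n]


for n in (1, 2, 3, 4, 6, 9):
    print(n, representations(n))
# 1 [(1, 0)]
# 2 []
# 3 []
# 4 [(2, 0)]
# 6 [(0, 1)]
# 9 [(3, 0)]
```

A genuine factorization of 2 would need two factors of norm 2, one of 3 would need two factors of norm 3, and one of √−6 would need factors of norms 2 and 3. Since neither 2 nor 3 has the required form, no such factors exist.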

We have now proved that (3) represents two different factorizations into 
prime factors of the number 6 in our system. If such a thing can occur 
in this system, then it is certainly not obvious that it cannot occur in the 
ordinary system. If there were any basis for asserting that it is obvious 
that factorization is unique in the ordinary number system, then, 
on the same basis, we could assert that it is obvious in every system. 
But we have found a system in which, far from being obvious, it is 
not even true. We shall see that factorization is unique in the 
ordinary system, but that it is a particular property of that system. In 
proving this, we will have to use particular properties of the system. 

It is noteworthy that the Greek mathematicians recognized this 
problem and felt the need for proving it for the sake of logical 
completeness and clarity, apparently without the aid of an example 
such as ours. The unique factorization theorem is proved in Euclid 
but it is stated somewhat differently, without the use of modern 
notation. Beside the difference in formulation of the theorem, the 
proof given by Euclid is different from the one we shall use. 

4. The number 30 is a multiple of 3. It is also a multiple of 5. 
This fact is expressed by saying that 30 is a “common multiple” of 
3 and 5. In general, a common multiple of two numbers is a number 
that is simultaneously a multiple of each of the numbers. No matter 
what the two numbers are, they certainly have at least one common 
multiple, for if the numbers are a and b their product ab must be a
common multiple. For 3 and 5, the product 3 · 5 = 15 is a common
multiple, as is 30. The number 30 can be recognized as a common
multiple of 3 and 5 by the fact that it is twice their product, 15.
In the same way it is clear that every multiple of ab is a common
multiple of a and b. Therefore two numbers always have an infinite
number of common multiples. 

The product ab and all its multiples together do not necessarily
give us all the common multiples of a and b. For example, the
multiples of 10 · 15 are 150, 300, 450, ..., while the common
multiples of 10 and 15 are 30, 60, 90, 120, 150, 180, .... Clearly 30
is the smallest number that is a common multiple of 10 and 15.
There is always a smallest common multiple of any two numbers
a and b, for one needs only to test each number from 1 to ab to decide
which are common multiples. There will always be at least one
common multiple, since ab itself is one. Among these common multiples
there will be a smallest one, called the least common multiple of a and b.

We first prove 

Lemma 1. Every common multiple of two numbers is a multiple of their 
least common multiple. For a = 10, b = 15 this asserts that the
multiples of 30, that is 30, 60, 90, ..., are all the common multiples
of 10 and 15. This can easily be verified in this case and in any
other particular case. However, we must prove it in general.

The proof depends on the simple fact that the difference of two 
common multiples of a and b is again a common multiple of a and b.
For the difference of two multiples of a is again a multiple of a;
this was used and discussed in Chapter 1. The same is true with
regard to b. Therefore, if each of two numbers is a multiple of
both a and b, then their difference is also divisible by both and hence
is a common multiple. 

Let m be the least common multiple of a and b and let N be any
common multiple. Then, by what we have just seen, N − m is
also a common multiple of a and b. If we again subtract m, we
find that the same is true for N − 2m. Continuing to subtract m,
we see that


N − m, N − 2m, N − 3m, ...


are all common multiples of a and b. Since m is the least of all
the common multiples, the first number N − m is certainly not
negative. The same may be true for some of the following numbers,
but eventually the numbers must be negative. Suppose that
N − xm is the last of these numbers that is positive. It is a common
multiple of a and b and is not greater than m, since subtracting m
gives the next number, which is no longer positive. Since m is
the least common multiple, the only possibility is that N − xm = m.
Therefore N = xm + m = (x + 1)m is a multiple of m.
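
Lemma 1 can be verified in any particular case by the same testing procedure that was used to find the least common multiple. The sketch below does this for 10 and 15; the function names and the search limit of 1000 are our own choices, and a check of this kind is of course no substitute for the general proof.

```python
def least_common_multiple(a, b):
    """Test each number from 1 to a*b and return the first common multiple."""
    for n in range(1, a * b + 1):
        if n % a == 0 and n % b == 0:
            return n


def common_multiples(a, b, limit):
    """All common multiples of a and b from 1 up to limit."""
    return [n for n in range(1, limit + 1) if n % a == 0 and n % b == 0]


m = least_common_multiple(10, 15)
print(m)                              # 30
print(common_multiples(10, 15, 180))  # [30, 60, 90, 120, 150, 180]
# Lemma 1 for this case: every common multiple found is a multiple of m.
print(all(n % m == 0 for n in common_multiples(10, 15, 1000)))  # True
```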

5. We can also speak of “common divisors” of two numbers.
A number c is a common divisor of a and b if it divides both a and b
exactly. According to lemma 1, the product ab, which is a common
multiple of a and b, is a multiple of the least common multiple m.
We can now prove 

Lemma 2. The quotient of the product of two numbers a and b divided 
by their least common multiple, that is, the number 

d = ab/m,

is always a common divisor of a and b.
From the equation for d we have

d · (m/a) = b,

and m/a is actually a whole number since m is a multiple of a. Therefore
b is a multiple of d or, in other words, d is a divisor of b. In
exactly the same way, d is a divisor of a and hence a common
divisor of a and b.
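
Lemma 2 is also easy to try out numerically. In this minimal sketch the helper names are our own and the three pairs of numbers are arbitrary illustrations.

```python
def least_common_multiple(a, b):
    """Test each number from 1 to a*b and return the first common multiple."""
    for n in range(1, a * b + 1):
        if n % a == 0 and n % b == 0:
            return n


def quotient_d(a, b):
    """Return d = ab/m, where m is the least common multiple of a and b."""
    m = least_common_multiple(a, b)
    d, remainder = divmod(a * b, m)
    assert remainder == 0  # guaranteed by lemma 1: ab is a multiple of m
    return d


for a, b in [(10, 15), (3, 5), (12, 18)]:
    d = quotient_d(a, b)
    print(a, b, d, a % d == 0 and b % d == 0)
# 10 15 5 True
# 3 5 1 True
# 12 18 6 True
```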

6. We can now prove a theorem from which we can immediately 
deduce the unique factorization theorem. We prove 

Theorem: If a prime p divides the product xy of two numbers x and y,
then p divides x or y, that is, it divides at least one of the factors. 

We consider the least common multiple m of p and x. On the 
one hand the product xy is a multiple of p by hypothesis, and it 
is obviously a multiple of x. Therefore it is a common multiple 
of p and x and, by lemma 1, it is then a multiple of m, 


(1) xy = hm. 

On the other hand, by lemma 2, the number


(2) d = px/m


is a whole number and a common divisor of p and x. A divisor of
a prime p can be only 1 or p. Therefore either d = p or d = 1.
In the first case d = p is a divisor of x, so p divides the first factor
x of xy. In the second case d = 1, and then (2) becomes 1 = px/m
or m = px. Therefore, in virtue of (1), xy = h(px). We can cancel
the factor x to obtain y = hp. In this case p divides the second
factor y of xy. In either case p divides at least one of the factors. 
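
That the theorem really depends on p being prime can be seen from a small search. The sketch below looks for products xy divisible by a number p although neither factor is; the function name and the search limits are our own choices.

```python
def exceptions(p, limit):
    """Pairs x <= y up to limit with p dividing x*y but dividing neither x nor y."""
    return [(x, y)
            for x in range(1, limit + 1)
            for y in range(x, limit + 1)
            if (x * y) % p == 0 and x % p != 0 and y % p != 0]


print(exceptions(7, 50))  # prime 7: []
print(exceptions(6, 6))   # composite 6: [(2, 3), (3, 4)]
```

For the prime 7 the search finds nothing, as the theorem requires, while the composite number 6 already divides 2 · 3 without dividing either factor.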

From this we have the 

Corollary: If a prime divides a product of several numbers, then it divides 
at least one of the factors. 

For if it divides, say, xyz, then it either divides x or yz. If it
divides the latter then it divides either y or z. In any case, it 
divides one of the factors. 

7. The unique factorization theorem follows at once. If we 
have two factorizations 


N = pqrs··· = PQRS···
of N into prime factors, then p divides N. Therefore p divides the
right hand product. By the corollary it divides one of the prime
factors. But if one prime divides another prime, the two must be
equal because of the definition of a prime. Therefore p must occur 
somewhere among the primes on the right hand side. In the 
same way q and all the other primes on the left must appear on the 
right. Since the left and right sides are interchangeable, all the 
primes on the right must appear on the left. In other words, the 
two factorizations contain exactly the same primes. 

Now we have only to see that each prime appears on both sides 
the same number of times. If p appears a times on the left and A
times on the right we have the factorizations


N = p^a q^b r^c ··· = p^A q^B r^C ···


If a and A were different one of them (say A) would be larger.
We could then divide by p^a to obtain


M = N/p^a = q^b r^c ··· = p^(A−a) q^B r^C ···.


This would then represent two factorizations of M with p explicitly 
entering on the right but absent on the left. We have just shown 
that in two factorizations of any number the same primes must 
appear in both factorizations. In particular this must be true for 
M, so a and A could not have been different. Therefore we have 
a =A and, in the same way, b= B,c=C,---. That is, each 
prime appears in each factorization the same number of times. 


Quite naturally one will wonder why this same proof does not 
hold in the number system a + b√−6 of § 3. In fact, nearly
all of the proof can be carried over into that system. The one 
part that cannot be carried over is lemma 1. This lemma must 
therefore be the essential step in our proof. 


12. The Four-Color Problem 


1. In 1879 Cayley discussed the following problem. A map is 
usually printed in several colors in order to distinguish between the 
different countries. It would be best if each country were printed 
in a different color, but this is too costly. Instead it is customary 
to use as few colors as possible, being careful that countries are always
differently colored when they are next to each other. Fig. 35a 
represents the map of an island that requires three colors, blue for 
the sea and two colors for the two countries. Fig. 35b requires 
four colors. The three countries all touch the sea so none can be
the same color as the sea. Since each country touches every other 
one, they require three different colors, making a total of four 
colors. Fig. 35c shows that four colors may be required even if 
we disregard the coloring of the sea. Here the inner country takes 
over the role of the sea in Fig. 35b. Fig. 36 again requires only 


Fig. 35a Fig. 35b Fig. 35c Fig. 36 Fig. 37


the three colors a, b, c, while the more complicated map of Fig. 37
can be colored with four colors. It would be natural to expect 
that more complicated maps would require more and more colors. 

Many maps have been drawn, but regardless of how complicated 
they were, none that requires more than four colors has ever been 
found. On the other hand, no one has ever been able to prove 
that four colors will always be enough for every conceivable map. 
This is again a problem that can easily be proposed and understood 
without any mathematical preparation, but it is still unsolved. 

However, it has been proved that every map can be colored with five 
colors. Whenever we say that a map can be colored we mean it in 
the sense that no two countries that have a boundary in common 
are to be given the same color. However, two countries may have 
the same color if they merely touch at a corner (as in the coloring 
of a checkerboard). Furthermore, by a country we mean a single 
piece of land and not a political subdivision made up of several 
separate parts. 

The main purpose of this chapter will be to prove this fact, that 
every map can be colored with five colors. In the proof we shall 
assume that the map represents a single island. If each single 
island and the sea can be colored with five colors, certainly a map 
consisting of several islands can be colored by merely using exactly
the same colors in each island. 

2. As a first preliminary to the proof we shall prove Euler’s 
theorem. This theorem is concerned with the number v of vertices
(corners), f of faces (countries), and e of edges (boundaries) in an 
arbitrary map. It is a general theorem and has important applica- 
tions other than the particular one at hand. This theorem, which 
was discovered by Euler but was already known to Descartes, 
asserts that 


(1) v + f = e + 2.


In Fig. 36, for example, there are 8 vertices (i.e. points at which at 
least three countries come together), 6 faces, and 12 edges (each 
edge extends from one vertex to the next), and we have, in fact, 


8 + 6 = 12 + 2.


In order to prove the theorem we shall discard for the moment 
the idea of a map and instead we shall think of the figure as representing
a system of dikes and fields. The edges are now dikes separating
the fields represented by the faces, and the outer area (originally 
the sea of the map) is covered with water. We now think of breaking 
down one dike after another until all the fields are under water 
(Fig. 38). In doing this it is not necessary to destroy all the dikes. 
Any dike that already has water on both sides of it can certainly 
be left. If we only break dikes that have water on just one side, 
then at each step we shall destroy one dike and flood one more field. 
The outer region (corresponding to the sea in the map) was under 
water at the start, so there are exactly f — 1 fields to be flooded. 
Since this process of flooding can certainly be carried out until all 


Fig. 38 Fig. 39 Fig. 40 


the fields are flooded, we shall finally have destroyed exactly f — 1 
dikes. 

We now wish to consider the system of dikes that has been left 
intact. 
I. One can walk dry-footed along the dikes from any vertex to any other
vertex. Before any dikes were destroyed this could certainly have 
been done, since we supposed at the start that the map represents 
a single island. Suppose that in the course of flooding the fields, 
the destruction of some dike AB (Fig. 39) would cut the system 
into two completely separated islands. If AB were destroyed it 
would be impossible to walk along dikes from A to B. That is, 
water would completely surround each of the two systems. Therefore 
there must be water on both sides of AB before it is destroyed, and it 
was explicitly stated that such a dike should not be destroyed. 

II. If one sends a messenger from any vertex P to any other vertex Q,
then there is just one path available to the messenger. For if there were
two different paths from P to Q, then they would surround some 
area (Fig. 40). The ring of undestroyed dikes making up the two 
paths would keep this area dry, contrary to the fact that all the 
fields are flooded. 

If we keep the starting point P fixed, there is just one path leading 
to each vertex. On each path there is a last edge that is passed over 
just before reaching the vertex. This edge is completely determined 
by the vertex. Therefore we have a correspondence between edges 
and vertices; corresponding to each edge is its end point. There 
are then just as many such end points as undestroyed dikes. The
starting point P is not an end point, so the number of undestroyed
dikes is v − 1. In all there are f − 1 destroyed dikes and v − 1
undestroyed. Therefore the total number e of dikes is


e = (f − 1) + (v − 1).


Euler’s formula (1) follows at once if we remove the parentheses 
and transpose the numerical term. 
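
The counting in this proof can be imitated in a short program. The sketch below works in the reverse direction from the flooding: instead of breaking dikes, it collects dikes to keep so long as they close no ring, which again leaves a system with exactly one path between any two vertices. The cube-shaped network is an assumption of ours, chosen only because it has the 8 vertices and 12 edges quoted for Fig. 36; the function name is likewise our own.

```python
def split_dikes(vertices, edges):
    """Split the dikes into those kept (closing no ring) and those broken."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's connected group.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    kept, broken = [], []
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            broken.append((a, b))   # would close a ring around a dry field
        else:
            parent[root_a] = root_b
            kept.append((a, b))
    return kept, broken


# Assumed map: the edges of a cube drawn in the plane (8 vertices, 12 edges).
CUBE_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0),
              (4, 5), (5, 6), (6, 7), (7, 4),
              (0, 4), (1, 5), (2, 6), (3, 7)]

kept, broken = split_dikes(range(8), CUBE_EDGES)
v, e = 8, len(CUBE_EDGES)
f = len(broken) + 1               # f - 1 dikes are broken
print(len(kept), len(broken), f)  # 7 5 6
print(v + f == e + 2)             # True
```

Which particular dikes are kept depends on the order in which they are examined, but the counts v − 1 and f − 1 do not.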

3. As a final preliminary to the proof of the five-color problem 
we shall show that it will be enough if we prove that five colors will color 


Fig. 41a 


every map in which no more than three countries meet at a vertex. If we have 
a map in which more than three countries meet at some vertex 
(Fig. 41a), then we can draw another map (Fig. 41b) which is an 
exact copy of the original map except that one small new country
has been formed around that vertex. This new map has one more 
country and several more vertices than the original, but only three 
countries meet at each new vertex and the vertex at which more 
than three countries meet has been eliminated. If we do the same 
thing for each vertex at which more than three countries meet, we 
will finally have a new map without any such vertices. Now if we 
can show that five colors are enough for every map in which no 
more than three countries meet at any vertex, five colors will be 
enough for the particular map we have just drawn. A possible 
coloring is shown by the letters in Fig. 41b. We can then color 
the original map by using the same color for each country as was 
used in the new map, disregarding the small countries that were 
added (Fig. 41a). It does not violate the rules if two countries 
that meet at a vertex but not along a boundary have the same 
color. 

A vertex is a point at which at least three countries meet, but we 
have seen that we need consider only maps in which no more than 
three countries meet at a vertex. Therefore we need consider only 
maps in which exactly three countries meet at each vertex. 

4. We are now ready to turn to the proof itself. We shall 
consider the number of vertices on the boundary of each country. 
If a given country has no vertices or just one vertex, then it has only 
one neighboring country and we can give it any color except that 
of its one neighbor. These countries can cause no difficulty, so 
we shall disregard them and assume that none are present in the 
rest of the proof. 

Let f₂ be the number of countries with just 2 vertices, f₃ the
number of countries with 3 vertices, etc. Then the total number f
of all countries is the sum of the numbers of each kind,


(2) f = f₂ + f₃ + f₄ + ···

The f₂ countries with 2 vertices each have 2 boundaries, that is,
2f₂ boundaries altogether. The f₃ countries with 3 vertices each
have 3 boundaries, 3f₃ boundaries in all. And so on. This count
will finally take into consideration all of the boundaries, but it will 
count each one twice, once for each of the two countries that are 
separated by it. Therefore we have 


(3) 2e = 2f₂ + 3f₃ + 4f₄ + ···
We can count the vertices in the same way. Since just three
countries touch at each vertex we obtain
(4) 3v = 2f₂ + 3f₃ + 4f₄ + ···
From (3) and (4) we have
(5) 3v = 2e.


Euler’s formula (1) multiplied by 6 is 
6v + 6f = 6e + 12 
and by (5), this is equal to 
9v + 12. 

Therefore we have 

6f= 3v+ 12 
or, because of (2) and (4), 

6(f₂ + f₃ + f₄ + ···) = (2f₂ + 3f₃ + 4f₄ + ···) + 12,

which simplifies to


(6) 4f₂ + 3f₃ + 2f₄ + f₅ = 12 + f₇ + 2f₈ + ···

From this we can show that in every map in which just three countries 
meet at each vertex, there is a country having less than 6 vertices. For if 
there were no such country, there would be no country with 2 
vertices. Therefore f₂ = 0. In the same way, f₃ = f₄ = f₅ = 0.
Therefore the left side of (6) would be 0, while the right hand side 
would be at least 12. The maps of Figs. 35a, 35b, 36, and 37 
have no countries with more than 6 vertices, so the right side of (6) 
is 12 for all these cases. On the other hand, we have 


in Fig. 35a: f₂ = 3, f₃ = 0, f₄ = 0, f₅ = 0,
in Fig. 35b: f₂ = 0, f₃ = 4, f₄ = 0, f₅ = 0,
in Fig. 36: f₂ = 0, f₃ = 0, f₄ = 6, f₅ = 0,
in Fig. 37: f₂ = 0, f₃ = 0, f₄ = 0, f₅ = 12.


In each of these examples there is only one of the numbers fᵢ that
differs from 0. Fig. 35c is different since it has f₂ = 0, f₃ = 2,
f₄ = 3, and all the rest are 0.
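
Equation (6) and the relations (3), (4), and (5) can be checked against these examples. In the sketch below each map is described only by its counts, a table from the number of vertices on a country's boundary to the number of such countries; the counts are taken as assumptions from the listing for the figures, and the function names are our own.

```python
def both_sides_of_6(counts):
    """Return the left and right sides of (6) for the given counts {k: f_k}."""
    left = sum((6 - k) * n for k, n in counts.items() if k < 6)
    right = 12 + sum((k - 6) * n for k, n in counts.items() if k > 6)
    return left, right


def vertices_edges_faces(counts):
    """Recover v, e, f from the counts, using 2e = 3v = 2f_2 + 3f_3 + 4f_4 + ..."""
    total = sum(k * n for k, n in counts.items())
    return total // 3, total // 2, sum(counts.values())


EXAMPLES = {
    "Fig. 35a": {2: 3},
    "Fig. 35b": {3: 4},
    "Fig. 35c": {3: 2, 4: 3},
    "Fig. 36": {4: 6},
    "Fig. 37": {5: 12},
}

for name, counts in EXAMPLES.items():
    left, right = both_sides_of_6(counts)
    v, e, f = vertices_edges_faces(counts)
    print(name, left, right, v + f == e + 2)
# Fig. 35a 12 12 True
# Fig. 35b 12 12 True
# Fig. 35c 12 12 True
# Fig. 36 12 12 True
# Fig. 37 12 12 True
```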

Now we know that in our map at least one country has less than 
6 vertices. It may have 2, 3, 4, or 5 vertices, and we must take 
up each of these four possibilities in turn. 

I. There is a country with 2 vertices. This country has just two 
neighbors. We can think of removing one of its two boundaries 
(the dotted one in Fig. 42). The new map has f — 1 instead of 
f countries. Let us suppose for the moment that this new map 
with fewer countries can be colored with only 5 colors. We can 
call a the color of the country formed by the original country and
its neighbor when we removed the boundary. The color of the
other neighbor can be called b. If we replace the boundary, then
the original country with 2 vertices can be given the color c, since
this country has only the two neighbors with colors a and b, leaving
c, d, and e available. Therefore, if the new map with f - 1 countries
can be colored with 5 colors, then the original map with f countries
can be colored with 5 colors as well.


Fig. 42    Fig. 43

II. There is a country with 3 vertices. Let L be the country and
$L_1$, $L_2$, $L_3$ its three neighbors (Fig. 43). We remove one of the
boundaries of L and suppose that the new map with f - 1 countries
will require no more than 5 colors. Then, on replacing the boundary,
we need only give L a color different from the colors of $L_1$, $L_2$,
$L_3$. That leaves us 2 colors, either of which can be used for L.

III. There is a country with 4 vertices. Arguing in the same way, 
we see that here at least one of the 5 colors is left for L, since its 4 
neighbors can use at most 4 different colors. However, a new kind 
of difficulty can arise in this case. A country such as $L_1$ of Fig. 44
might touch L along two different boundaries. If we remove one of


Fig. 44 


these boundaries we must remove the other, for a boundary cannot
separate a country from itself; that is not what we mean by a
boundary. This would yield a country bounded by two completely
separate boundaries and shaped like a ring. We have tacitly
excluded such countries in our proof of Euler’s theorem. However,
we can avoid forming a ring-shaped country. If L and $L_1$ together
form a ring, then the other two boundaries of L must belong to two
countries $L_2$ and $L_4$ that are separated by the ring. Consequently
$L_2$ and $L_4$ are different and have no boundary in common, so they
can be given the same color. We remove the boundaries that
separate L from $L_2$ and $L_4$ and obtain a new map with f - 2 countries.
If this map with fewer than f countries can be colored with 5 colors,
then so can the original, for L has only 3 neighbors and two of these
have the same color. That leaves 3 colors available, any one of
which can be given to L.

IV. There is a country with 5 vertices. The same difficulties can
arise in an even more complicated form. Either one country may
have two different boundaries in common with L (Fig. 45a), or
two neighbors of L may touch each other along some distant
boundary ($L_1$ and $L_3$ in Fig. 45b). In both cases L has two neighbors,
$L_2$ and $L_4$, which are different and do not touch. This is clear,
since they cannot cross the ring formed by L and $L_1$ in Fig. 45a


Fig. 45a Fig. 45b Fig. 46 


or by L, $L_1$, $L_3$ in Fig. 45b. In both these cases as well as in the
simple case where no ring occurs, L has two neighbors $L_2$ and $L_4$
which do not meet on a boundary (Fig. 46). Now we remove the
boundaries between L and both $L_2$ and $L_4$. The new map has
f - 2 countries, fewer than the original map. Let us again suppose
that the new map can be colored with 5 colors and that $L + L_2 + L_4$
has the color a, $L_1$ has b, $L_3$ has c, $L_5$ has d. This is a total
of 4 colors. On replacing the two boundaries, we can color L
with the fifth color.

Every map will belong to one of these four cases, so our reduction 
process is complete. Every map can be reduced by removing one
or two boundaries. The reduced map has fewer countries and, if
it can be colored with 5 colors, so can the original map. We think
of repeating the reduction and continuing until at most 5 countries 
are left. A map with no more than 5 countries can obviously be 
colored with 5 colors. The same is then true for each map of our 
reduction process, and hence is true for the original map. 

5. We have proved the five-color theorem for maps drawn on 
a flat plane. However, the same proof can be used for maps on 
a globe. The outer country, that we usually thought of as the 
sea, fills the remainder of the globe. An ambiguity would occur 
only if we had not included the sea as one of the countries to be 
colored. Our proof remains the same step for step on the globe. 
We do not have to go over it all again to see this. The proof does 
not make any use of equality of line segments or angles. It does 
not involve any congruence theorem or any other idea that does not 
carry over directly to the surface of a sphere. The only concepts 
that are used are those connected with the relative positions of points, 
curves, and areas, and these are the same on a sphere as on a plane. 

The situation is different if we consider a map on a surface shaped 
like a doughnut. On this surface it is possible to draw a map 
consisting of 7 countries, each of which has a boundary in common 
with all 6 other countries. Therefore this map requires 7 colors. 
It is not easy to visualize a map on this surface without having a 
model in one’s hand, so we shall content ourselves with merely 
stating the facts. Certainly our proof of the five-color theorem 
cannot be carried over to this surface. Why does our proof easily 
carry over to the globe but suddenly fail for the doughnut? 

There are two points at which our proof fails in the case of the 
doughnut, which is usually called a torus. The first is at the end of 
the proof of Euler’s theorem where, with reference to Fig. 40, we 
said that two different paths from P to Q would surround an area. 
The second is in the proof in cases III and IV, where we said that
$L_2$ and $L_4$ (Figs. 44, 45a, 45b) could not touch because they are
separated by a ring made up of L and $L_1$ or L, $L_1$, and $L_3$. In both
cases the proof on the plane or sphere depends on the fact that one 
cannot go from a point on one side of a closed curve to a point 
on the opposite side without crossing the curve. This is not true 
on the torus. In order to help us visualize the surface, let us think 
of the ring of Saturn as a solid rather than the loose mass of particles 
that it really is. This ring is a torus. Let us also suppose that a 
river flows around the ring, always remaining on the side that is
away from Saturn (Fig. 47). If we are standing at A, on one bank
of the river, can we walk to B on the opposite bank without crossing 
the river? All we need do is to walk directly away from the river 


Fig. 47 


to the point C, across the side nearest Saturn to D, and then under 
the ring and up to B. 

This situation shows how careful we must be when relying on 
intuition and how easily our intuition can lead us astray. The part 
of the proof relating to Fig. 40 seemed intuitively obvious at the 
time, but now we see that it really requires a logical proof. This 
proof must depend on special properties of the plane and sphere, 
and it will certainly not remain valid for the torus. 

We shall not go into this deeper analysis of the character of the 
plane and sphere. The properties that we have assumed can be 
proved, but it is not easy to do so. In the case of the torus it can 
be shown that Euler’s formula must be replaced by 


$v + f = e.$


Also, our proof of the five-color theorem can be carried over to the 
torus to show that 7 colors are enough to color every map on this 
surface. It is interesting to note that on the more complicated torus 
the color problem is completely solved, while on the simpler plane 
or sphere it is not known whether 5 or 4 colors are required. 


13. The Regular Polyhedrons 


1. We are going to make use of Euler’s theorem to obtain an 
entirely different sort of result. We shall investigate the question: 
Do regular polyhedrons exist, and how many are there? A polyhedron 
is a solid figure bounded by portions of planes called faces.
According to Euclid, a polyhedron is “regular” if all its faces are
congruent polygons having equal sides and angles (regular polygons).
Our problem will be more extensive and the answer more
satisfactory if we use a more general definition. We shall say that a
polyhedron is “regular” if all its faces have the same number of
sides, and if the same number of faces come together at each vertex 
(corner). In this definition we say nothing about equal sides, 
angles, or areas, nor anything about size. Only the number of 
certain parts is used in the definition. 

We shall let $\varphi$ represent the number of vertices possessed by
each face. If the polyhedron is formed by triangles, then $\varphi = 3$,
etc. The number of faces that meet at each vertex can be called
$\varepsilon$. Since a polygon must have at least 3 vertices, we have


(1)  $\varphi \ge 3.$


In a solid figure a vertex is a point where at least 3 faces meet,
so we also have


(2)  $\varepsilon \ge 3.$


We shall let v, f, and e represent the number of vertices, faces, and 
edges of the polyhedron respectively. 

Let us imagine that the polyhedron is hollow and that it is made 
of some flexible substance such as rubber. We can then think of 
blowing the figure up until it becomes spherical. The faces of the 
polyhedron become pieces of the curved surface of the sphere and 
the edges become pieces of curved lines on the sphere. If we 
think of the sphere as a globe, then the original polyhedron has 
become a map representing f countries. Each country was formed 
from one of the faces. The boundaries of the map come from the 
edges of the polyhedron, so there must be e of them. Similarly, the
map has the same number v of vertices as the original polyhedron.
Each country has the same number $\varphi$ of vertices and also of
boundaries. Therefore only one of the numbers $f_2, f_3, \cdots$ (§ 4)* is
different from 0, and this one must be equal to the total number of
faces f. Exactly $\varepsilon$ countries come together at each vertex.
Examples of such maps are given in Figs. 35b, 36, 37. In these examples
we have


$\varphi = 3,\ \varepsilon = 3$;    $\varphi = 4,\ \varepsilon = 3$;    $\varphi = 5,\ \varepsilon = 3$


respectively. 
By blowing up the polyhedron we have shown that Euler’s theorem 
applies to polyhedrons as well as to maps. That is, in every polyhedron 


we have the formula

(3)  $v + f = e + 2$

connecting the number of vertices, faces, and edges. Historically, it is in
this form that Euler discovered the formula, and this is the form in
which it is usually stated.

*Chap. 12

2. Each face of the regular polyhedron has $\varphi$ vertices and $\varphi$
edges. The f faces then account for $f\varphi$ edges in all, but here we
count each twice, since an edge is a side of each of the two faces it
separates. Therefore we have

(4)  $f\varphi = 2e.$


Similarly, $\varepsilon$ faces and hence $\varepsilon$ edges meet at each vertex. That
gives $v\varepsilon$ edges, but each is again counted twice because each edge
has two ends. This gives us the formula


(5)  $v\varepsilon = 2e.$
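From (3) we have

$v + f - e = 2,$

and if it is multiplied by $2\varepsilon$ it becomes

$2v\varepsilon + 2f\varepsilon - 2e\varepsilon = 4\varepsilon.$


Now according to (4) we can replace $2e$ by $f\varphi$ and, since (4) and
(5) show that $f\varphi = v\varepsilon$, we can also replace $v\varepsilon$ by $f\varphi$. This yields


$2f\varphi + 2f\varepsilon - f\varphi\varepsilon = 4\varepsilon$


or

(6)  $f(2\varphi + 2\varepsilon - \varphi\varepsilon) = 4\varepsilon.$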


Since f and $4\varepsilon$ are positive numbers, the same must be true of the
factor in parentheses, so we have


(7a)  $2\varphi + 2\varepsilon - \varphi\varepsilon > 0.$


Formulas (1) and (2) give lower bounds for $\varphi$ and $\varepsilon$. Now we
want to find upper bounds from (7a). First we can change the
algebraic signs in (7a) to obtain


(7b)  $\varphi\varepsilon - 2\varphi - 2\varepsilon < 0.$

If we compare the left side of (7b) with the product

(8)  $(\varphi - 2)(\varepsilon - 2) = \varphi\varepsilon - 2\varphi - 2\varepsilon + 4$


we see that it differs only by the added term 4. Consequently we
shall add 4 to both sides of (7b):


$\varphi\varepsilon - 2\varphi - 2\varepsilon + 4 < 4.$

This, with (8), gives us

(9)  $(\varphi - 2)(\varepsilon - 2) < 4.$

From (1) and (2) we see that the factors $(\varphi - 2)$ and $(\varepsilon - 2)$
of (9) are at least 1. Therefore $(\varphi - 2)$ and $(\varepsilon - 2)$ are positive
whole numbers whose product is less than 4. There is no difficulty
in finding all products of this kind. They are

(10)  $1 \cdot 1, \quad 1 \cdot 2, \quad 2 \cdot 1, \quad 1 \cdot 3, \quad 3 \cdot 1,$
or five products in all. Every other product of two positive whole 
numbers would either be 4 or a larger number. From the fact that 
there are only five products of two positive whole numbers which are less 
than four, we will deduce that there can be only five regular polyhedrons. 

3. In the five products (10) we can suppose that the first factor 
is $(\varphi - 2)$, the second $(\varepsilon - 2)$. We then have the five pairs of
values for $\varphi$ and $\varepsilon$:


$\varphi$ |  3   3   4   3   5
$\varepsilon$ |  3   4   3   5   3


We have obtained this table directly from Euler’s theorem. Now
if we remember the meaning of $\varphi$ and $\varepsilon$, we see from the table
that regular polyhedrons, if they exist, can have only triangles, 
quadrilaterals, or pentagons as faces. Furthermore, 3, 4, or 5 
faces must meet at the vertices. Now from (6) we have

$f = \dfrac{4\varepsilon}{2\varphi + 2\varepsilon - \varphi\varepsilon}$

and, using (8), this can be written as

(11)  $f = \dfrac{4\varepsilon}{4 - (\varphi - 2)(\varepsilon - 2)}.$

With (4) this gives

(12)  $e = \dfrac{f\varphi}{2} = \dfrac{2\varphi\varepsilon}{4 - (\varphi - 2)(\varepsilon - 2)},$

and from (5) and (12) we find

(13)  $v = \dfrac{2e}{\varepsilon} = \dfrac{4\varphi}{4 - (\varphi - 2)(\varepsilon - 2)}.$

The last three formulas give us just one value each for f, e, and v,
corresponding to each of the five possible pairs of values of $\varphi$ and $\varepsilon$.
These values are listed in the following table: 


$\varphi$   $\varepsilon$ |  f |  e |  v |

3   3 |  4 |  6 |  4 | Tetrahedron

3   4 |  8 | 12 |  6 | Octahedron

4   3 |  6 | 12 |  8 | Hexahedron

3   5 | 20 | 30 | 12 | Icosahedron

5   3 | 12 | 30 | 20 | Dodecahedron


Therefore there are only five possible regular polyhedrons. They are also 
called the five Platonic solids and are named, as shown in the table, 
according to the number of faces. 
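
The enumeration can be repeated mechanically. The sketch below searches all pairs of whole numbers from 3 up to an arbitrary bound (the bound `limit` and the function name are assumptions made for illustration), keeps those satisfying condition (9), and computes f, e, and v from formulas (11), (12), and (13). Here `phi` and `eps` stand for the number of vertices per face and the number of faces per vertex.

```python
def regular_types(limit=50):
    """List (phi, eps, f, e, v) for all pairs with (phi - 2)(eps - 2) < 4.

    phi = vertices per face, eps = faces meeting at each vertex.
    `limit` is an arbitrary search bound chosen for this sketch.
    """
    rows = []
    for phi in range(3, limit):
        for eps in range(3, limit):
            product = (phi - 2) * (eps - 2)
            if product >= 4:
                continue  # violates condition (9)
            d = 4 - product
            # Formulas (11), (12), (13); each quotient must be a whole number.
            assert (4 * eps) % d == 0
            assert (2 * phi * eps) % d == 0
            assert (4 * phi) % d == 0
            f = 4 * eps // d
            e = 2 * phi * eps // d
            v = 4 * phi // d
            rows.append((phi, eps, f, e, v))
    return rows


if __name__ == "__main__":
    for row in regular_types():
        print(row)
```

Raising `limit` does not produce further rows, since the product $(\varphi - 2)(\varepsilon - 2)$ only grows with either factor.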

4. This table has a very special property. If $\varphi$ and $\varepsilon$ are
interchanged, the icosahedron and the dodecahedron are interchanged,
as are the octahedron and hexahedron. The tetrahedron remains
unchanged. At the same time e is unchanged, but v and f are
interchanged. We could have seen this from the previous formulas
without reference to the table. Since the conditions (1), (2), and
(9) are symmetrical in $\varphi$ and $\varepsilon$, any admissible pair of values for
$\varphi$ and $\varepsilon$ is still admissible if we interchange $\varphi$ and $\varepsilon$. Furthermore,
(12) is also symmetrical in $\varphi$ and $\varepsilon$, so e is not altered by the
interchange. Finally, (11) and (13) show that the interchange of
$\varphi$ and $\varepsilon$ merely interchanges v and f.
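
This symmetry is easy to confirm numerically. In the sketch below, the helper `counts` (a name of our own) evaluates formulas (11), (12), and (13) for a given pair, with `phi` the number of vertices per face and `eps` the number of faces per vertex, and `swap_check` compares the result with that of the interchanged pair.

```python
def counts(phi, eps):
    """Return (f, e, v) from formulas (11), (12), (13) for an admissible pair."""
    d = 4 - (phi - 2) * (eps - 2)
    return (4 * eps // d, 2 * phi * eps // d, 4 * phi // d)


# The five admissible pairs (phi, eps) from the table.
PAIRS = [(3, 3), (3, 4), (4, 3), (3, 5), (5, 3)]


def swap_check():
    """True if swapping phi and eps keeps e and exchanges v with f in every case."""
    for phi, eps in PAIRS:
        f, e, v = counts(phi, eps)
        f_swapped, e_swapped, v_swapped = counts(eps, phi)
        if (f_swapped, e_swapped, v_swapped) != (v, e, f):
            return False
    return True
```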

This relationship is also clear for purely geometric reasons. We 
need only choose an arbitrary point inside each face of a polyhedron 
and take these points as vertices of a new polyhedron. The edges 
of the new polyhedron can be drawn between every two vertices 
that are on neighboring faces of the original polyhedron. Then 
exactly one vertex of the new polyhedron lies on each face of the 
original, exactly one new edge crosses over each old edge, and 
exactly one new face cuts off each old vertex. Therefore the
number e is the same in both polyhedrons, while v and f are
interchanged.

5. We have left one important point undiscussed. So far we 
have only seen that there can be at most five regular polyhedrons 
since, if all the faces have the same number of vertices and the 
same number of faces meet at each vertex, the number of faces, 
edges, and vertices must correspond to one of the five possibilities 
shown in the table.
While the table includes all possible regular polyhedrons, we have
not yet determined whether these five polyhedrons can actually be 
constructed. It is quite conceivable that some further restrictions 
that we have not taken into consideration might exist. These further 
restrictions might eliminate one or more of the types listed in the 
table. In a word, we have discussed necessary but not sufficient 
conditions for our regular polyhedrons. 

Actually, our discussion was concerned more with “regular”
maps on a sphere than it was with the polyhedrons themselves. 
These maps are merely the polyhedrons after they have been blown 
up into spheres. Now we can actually exhibit a map corresponding 
to each type in the table. Fig. 35b represents the tetrahedron, 
Fig. 36 the hexahedron, and Fig. 37 the dodecahedron. Maps 
corresponding to the octahedron and icosahedron are shown in 
Fig. 48 and Fig. 49. Although all the maps are drawn on a flat 


Fig. 48 Fig. 49 


plane, they may be transferred to a sphere. Now, if we disregard 
the deformation to which the polyhedrons have been subjected in 
blowing them up into spheres, we see that the conditions we have 
set up for regular polyhedrons are also sufficient. 

This completes the discussion of our problem, but we have not 
shown that our figures can be constructed as regular polyhedrons 
according to the narrower definition of Euclid, that is, with con- 
gruent, regular plane polygons as faces. To prove this would 
require the use of concepts and theorems of an entirely different 
sort. Our investigation has used only such properties as remain 
unchanged under such deformations as stretching. Congruence is 
not such a property. It would require the use of metric geometry, 
in which equality of lengths and of angles is of importance, to show 
that our polyhedrons can be constructed according to Euclid’s 
definition. We shall not carry out this proof. The existence of
exactly five regular polyhedrons in the narrower sense dates from
classical times. The proof is attributed to Theaetetus, a student
of Plato’s, and it is given by Euclid at the end of Book XIII of his
Elements.


14. Pythagorean Numbers and Fermat’s Theorem 


1. According to the Pythagorean theorem, the square on the 
hypotenuse of a right triangle has the same area as the sum of the 
squares on the two legs. Conversely, if three line segments are 
such that the square on one is equal to the sum of the squares on 
the other two, then the three segments will form a right triangle. 
The equation $a^2 + b^2 = c^2$ represents the fact that the segments
of length a, b, c are the sides of a right triangle.

We have already seen in Chapter 4 that the hypotenuse and legs of
an isosceles right triangle are incommensurable, that the equation
$2a^2 = c^2$ can never be satisfied by whole numbers a and c. Are
there any right triangles in which the sides are commensurable?
In other words, can the equation


(1)  $a^2 + b^2 = c^2$

be satisfied by three whole numbers? A simple and very well known
example shows that the answer is yes:


$3^2 + 4^2 = 5^2$  or  $9 + 16 = 25.$


Are there other answers? How can we find them? In this chapter 
we shall find the complete answer to these questions. 

2. If we have a solution a, b, c of (1), we can easily find another
by multiplying each of the terms a, b, and c by any whole number.
Since 3, 4, 5 is a solution, we can multiply by 2 to find 6, 8, 10.
This gives us

$6^2 + 8^2 = 10^2.$


More generally 3n, 4n, 5n will be a solution if n is any whole number.
In the same way, if a, b, c is any solution, then an, bn, cn is also a
solution, for from $a^2 + b^2 = c^2$ we have $a^2n^2 + b^2n^2 = c^2n^2$ or
$(an)^2 + (bn)^2 = (cn)^2$. This way of finding new solutions is trivial
and therefore not of much interest. It is more interesting to find
the basic solutions, those that can’t be found merely by multiplying
another solution by some whole number. We shall call such
solutions “reduced solutions”, those solutions in which a, b, and c
do not have a common divisor. Thus 3, 4, 5 is a reduced solution.
If two or more numbers have no common divisor other than 1 we
shall say that they are “relatively prime.” In a reduced solution, each
pair of the numbers a, b, c is relatively prime. For if a and b, say, had
a common divisor d greater than 1, they would also have every divisor
of d as a common divisor. Some prime p would divide d; at worst, d
would be p itself. Then a and b would have the common divisor p and we
could write

$a = pa_1, \quad b = pb_1.$

Equation (1) would then become

$p^2(a_1^2 + b_1^2) = c^2,$

from which we see that $p^2$ would divide $c^2$. Then p would divide
$c^2 = c \cdot c$, and hence one of the two equal factors c. That is, p
would divide c as well as a and b. In the same way, a common
prime factor of a and c or of b and c is a common factor of all three
numbers.

3. Now we are looking for the solutions a, b, c of (1), in which
every pair of a, b, and c is relatively prime. No two of the numbers
can be even, that is, divisible by two; at most, only one can be
even. Neither can all three numbers be odd, however. The
square of an odd number $a = 2l + 1$ is $a^2 = 4l^2 + 4l + 1$, which
is again odd. Then, if a and b are odd, so are $a^2$ and $b^2$, so $a^2 + b^2$
is even and cannot be equal to the square of an odd c.

The only possibility that remains is that two of the numbers a,
b, c are odd and one is even. Furthermore, we can see that c must
be odd, for if c is even it is divisible by 2 and $c^2$ is divisible by 4.
The other two numbers would then be odd,


$a = 2l + 1, \quad b = 2m + 1,$

and we find

$a^2 + b^2 = (4l^2 + 4l + 1) + (4m^2 + 4m + 1) = 4(l^2 + l + m^2 + m) + 2.$


This number is even, but on division by 4, it leaves the remainder 2
and therefore could not equal $c^2$, which is divisible by 4.

We are now left with c odd and one of the numbers a and b even,
the other odd. We shall let a be the odd number, b the even one.
Thus in our example we have a = 3, b = 4, c = 5.
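
The parity pattern can be observed directly by listing reduced solutions by brute force. The following sketch (function name and search bound are our own, chosen for illustration) collects all reduced solutions with c up to a given bound and leaves it to the reader to confirm that c is always odd and exactly one of a, b is even. A finite search illustrates the argument above but does not replace it.

```python
from math import gcd, isqrt


def reduced_solutions(max_c):
    """Return all (a, b, c) with a < b, a^2 + b^2 = c^2, c <= max_c,
    and no common divisor greater than 1."""
    found = []
    for a in range(1, max_c):
        for b in range(a + 1, max_c):
            s = a * a + b * b
            c = isqrt(s)
            if c <= max_c and c * c == s and gcd(gcd(a, b), c) == 1:
                found.append((a, b, c))
    return found


if __name__ == "__main__":
    for a, b, c in reduced_solutions(50):
        print(a, b, c, "c odd:", c % 2 == 1)
```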

4. Equation (1) can be written in the form


(2)  $b^2 = c^2 - a^2 = (c + a)(c - a).$

Here $(c + a)$ and $(c - a)$, being the sum and difference of two odd
numbers, are both even. Their only common factor is 2. In other
words, $\frac{c + a}{2}$ and $\frac{c - a}{2}$ are relatively prime, as we can see by
supposing that a number d greater than 1 divides these two numbers. Then

$\frac{c + a}{2} = df, \quad \frac{c - a}{2} = dg,$

and on adding and subtracting these two equations we find

$c = d(f + g), \quad a = d(f - g).$

Then d divides both a and c, and this contradicts our assumption
that they are relatively prime.

Since b, c + a, and c - a are all even, we can write (2) in the
form

(3)  $\left(\frac{b}{2}\right)^2 = \frac{c + a}{2} \cdot \frac{c - a}{2},$

where the fractions are only apparent since each is actually a whole
number. This equation expresses the square $\left(\frac{b}{2}\right)^2$ as a product of
two relatively prime factors $\frac{c + a}{2}$ and $\frac{c - a}{2}$. We now come to the
essential step of the proof. We prove that $\frac{c + a}{2}$ and $\frac{c - a}{2}$ must
each be squares. If $\frac{b}{2}$ is factored into prime factors,

$\frac{b}{2} = p^{\alpha} q^{\beta} r^{\gamma} \cdots,$

where p, q, r, ... are different primes, then we have

$\left(\frac{b}{2}\right)^2 = p^{2\alpha} q^{2\beta} r^{2\gamma} \cdots.$

All the prime factors of $\left(\frac{b}{2}\right)^2$ must appear in $\frac{c + a}{2}$ and $\frac{c - a}{2}$ taken
together. However, each prime factor p must either appear only
in $\frac{c + a}{2}$ or only in $\frac{c - a}{2}$, since $\frac{c + a}{2}$ and $\frac{c - a}{2}$ have no common
factor. Therefore the prime factors of $\left(\frac{b}{2}\right)^2$ are distributed between
$\frac{c + a}{2}$ and $\frac{c - a}{2}$ in such a way that each prime power $p^{2\alpha}$, $q^{2\beta}$, $r^{2\gamma}$, ...
goes entirely into $\frac{c + a}{2}$ or $\frac{c - a}{2}$. Therefore $\frac{c + a}{2}$ and $\frac{c - a}{2}$
contain only even powers of their prime factors and consequently each
is a square.
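
The essential step, that two relatively prime whole numbers whose product is a square must each be squares, can be tested on small numbers. The sketch below searches for a counterexample among pairs below a bound and finds none. The names `is_square` and `find_counterexample` are our own, and the finite search is only an illustration, not a substitute for the proof.

```python
from math import gcd, isqrt


def is_square(n):
    """True if the non-negative whole number n is a perfect square."""
    r = isqrt(n)
    return r * r == n


def find_counterexample(limit):
    """Look for relatively prime m, n < limit whose product is a square
    although m or n is not a square. Returns the pair, or None."""
    for m in range(1, limit):
        for n in range(1, limit):
            if gcd(m, n) == 1 and is_square(m * n):
                if not (is_square(m) and is_square(n)):
                    return (m, n)
    return None
```

The hypothesis that the factors are relatively prime cannot be dropped: 2 times 8 is 16, a square, although neither 2 nor 8 is a square.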
5. We can now write


(4a)  $\frac{c + a}{2} = u^2, \quad \frac{c - a}{2} = v^2,$


(4b)  $\left(\frac{b}{2}\right)^2 = u^2v^2,$


where u and v, like $u^2$ and $v^2$, are relatively prime. From (4b)
we have

(5)  $b = 2uv,$


and by addition and subtraction of the equations (4a) we find

(6)  $c = u^2 + v^2, \quad a = u^2 - v^2.$


Since c and a are both odd, one of the squares $u^2$ and $v^2$ must be even
and the other odd; in any other case their sum and difference would
be even. The same must be true of u and v. We shall say that
two such numbers are of “opposite parity”.

We have now proved that if a, b, c is a reduced solution of (1),
then a, b, and c can be represented in the form (5) and (6) by means
of two numbers u and v which are relatively prime and of opposite
parity. In our old example, a = 3, b = 4, c = 5, we have

$u^2 = \frac{5 + 3}{2} = 4, \quad v^2 = \frac{5 - 3}{2} = 1,$

$u = 2, \quad v = 1,$

and, in agreement with (5),

$b = 4 = 2 \cdot 2 \cdot 1 = 2uv.$


6. As yet we have only found necessary conditions for a reduced
solution. We have started with a reduced solution a, b, c and
have determined u and v. To complete the discussion we must
show that the conditions are sufficient, that the a, b, c given by (5)
and (6) is always a reduced solution when u and v are relatively
prime and of opposite parity. We shall also add the condition
u > v to ensure that a is positive.

In the first place, a simple computation yields

$(u^2 - v^2)^2 + (2uv)^2 = (u^2 + v^2)^2,$

so the numbers given by (5) and (6) satisfy the equation (1).
Secondly, to see that we have a reduced solution, we remember that
u and v are relatively prime and one of them is even, the other odd.
Equation (6) shows that a and c are both odd, so a, b, c do not have
the common factor 2. They cannot have an odd common factor
either, for if they did they would have, as common factor, an odd
prime p (any prime other than 2). We could then write


$c = pc_1, \quad a = pa_1,$

and, from (4a), we would have

$2u^2 = c + a = p(c_1 + a_1),$
$2v^2 = c - a = p(c_1 - a_1).$


These equations imply that p divides both $2u^2$ and $2v^2$. Since p
is different from 2 it would divide $u^2$ and $v^2$, but this is not
compatible with the fact that u and v are relatively prime.

Some examples of Pythagorean numbers derived from (5) and (6) are
given in the following list:


u = 2,  v = 1,  a = 3,   b = 4,   c = 5,
u = 3,  v = 2,  a = 5,   b = 12,  c = 13,
u = 4,  v = 1,  a = 15,  b = 8,   c = 17,
u = 4,  v = 3,  a = 7,   b = 24,  c = 25,
u = 5,  v = 2,  a = 21,  b = 20,  c = 29,
u = 5,  v = 4,  a = 9,   b = 40,  c = 41.


A glance at the table shows that not only is b even, but it is always
a multiple of 4. This is true because b = 2uv, and either u or v
is even.
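
Formulas (5) and (6) translate directly into a procedure for listing reduced solutions. The sketch below is one possible rendering, with function names and the bound `max_u` chosen by us for illustration. It runs through all relatively prime pairs u > v of opposite parity and returns u, v, a, b, c for each.

```python
from math import gcd


def triple(u, v):
    """Apply formulas (5) and (6): a = u^2 - v^2, b = 2uv, c = u^2 + v^2."""
    return (u * u - v * v, 2 * u * v, u * u + v * v)


def reduced_triples(max_u):
    """Return (u, v, a, b, c) for all relatively prime u > v >= 1
    of opposite parity with u <= max_u."""
    rows = []
    for u in range(2, max_u + 1):
        for v in range(1, u):
            if gcd(u, v) == 1 and (u - v) % 2 == 1:
                rows.append((u, v) + triple(u, v))
    return rows


if __name__ == "__main__":
    for u, v, a, b, c in reduced_triples(5):
        print(f"u={u} v={v} a={a} b={b} c={c}")
```

Each row can be checked against equation (1), and in each row b is a multiple of 4, as observed above.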

7. Now that we have completely solved the problem connected 
with equation (1), a whole series of generalizations comes to mind. 
We can ask for a similar discussion of the equation 


(7a)  $x^3 + y^3 = z^3$

or of

(7b)  $x^4 + y^4 = z^4,$

or, more generally, of

(7c)  $x^n + y^n = z^n$

for any n > 2. Pierre de Fermat (1601-1665) asserted that the
equation (7c) has no solution in positive whole numbers x, y, z
for n > 2. This statement, that has never been proved or disproved,
is called Fermat’s theorem, or Fermat’s last theorem, to distinguish 
it from another Fermat theorem which we will mention in 
Chapter 23. However, the assertion has been proved for certain 
values of n. For example, it has been proved for all n from 3 to 
100 by Kummer (1810-1893) and his followers. Before this, Euler 
(1707-1783) had proved it for (7a) and (7b). 

With the aid of our knowledge of Pythagorean numbers, we can 
easily prove that (7b) cannot be solved in positive whole numbers.
We shall even show that the equation


(8)  $x^4 + y^4 = w^2$


cannot be solved in positive whole numbers. Since every fourth
power is a square, but not every square is a fourth power, the
insolubility of (8) is more significant than that of (7b).

In discussing the solutions, we insist on positive whole numbers in
order to eliminate certain trivial solutions. For example, (8) is
certainly satisfied by x = 1, y = 0, w = 1, while (7a) is satisfied
by x = -y, z = 0.

8. If we write (8) in the form


(9)  $(x^2)^2 + (y^2)^2 = w^2,$


we see that it is a special case of (1) with $a = x^2$, $b = y^2$, $c = w$.
Since we need only consider “reduced” solutions x, y, w, we can see
(as we did above) that if (9) has a solution w must be odd, while
one of the squares $x^2$ and $y^2$ must be even and the other odd. We
can let $x^2 = a$ be odd and $y^2 = b$ be even. If (9) has a solution,
it must be given by (5) and (6). There must be two relatively
prime numbers u and v, of opposite parity, determining the numbers


(10a)  $x^2 = u^2 - v^2$;    (10b)  $y^2 = 2uv$;    (10c)  $w = u^2 + v^2.$

Now (10a) can be written as

(11)  $x^2 + v^2 = u^2,$


which is a new Pythagorean equation with x, v, u relatively prime.
Here u takes the place of c, so it is odd. Since x is also odd, v must
be even and it must take the place of b. The reduced solution x,
v, u of the Pythagorean equation (11) must again be given by (5)
and (6) by means of new relatively prime numbers $u_1$ and $v_1$ having
opposite parity. That is, x, v, u are given by


(12)  $x = u_1^2 - v_1^2, \quad v = 2u_1v_1, \quad u = u_1^2 + v_1^2.$


We now go back to equation (10b). Since u is odd, v even, and
the two are relatively prime, we see that u and 2v are relatively
prime. This is because 2 does not divide u. Therefore (10b) expresses
$y^2$ as a product of two relatively prime factors u and 2v. According to
the discussion in § 4, such a factorization of a square is possible only
if each of the relatively prime factors is a square. Therefore we
have


(13)  $u = w_1^2, \quad 2v = 4t_1^2,$


where we have made use of the fact that 2v is even. If we insert
these two values in the last two equations of (12) we find


(14)  $t_1^2 = u_1v_1, \quad w_1^2 = u_1^2 + v_1^2.$


Now $u_1$ and $v_1$ are relatively prime and their product is $t_1^2$, so they
must again be squares,


(15)  $u_1 = x_1^2, \quad v_1 = y_1^2.$

The second equation of (14) now becomes

(16)  $x_1^4 + y_1^4 = w_1^2.$


9. Equation (16) is of the same type as the original equation (8). 
Starting with one reduced solution x, y, w (positive numbers) of (8),
we have found another reduced solution for the same equation.
The second solution is obtained from the first by means of a certain
process. Without going through the whole process, we can see that
the value of w in the first solution is larger than the value of $w_1$
in the second. For, by (10c) and the first equation of (13), we have


$w = u^2 + v^2 = w_1^4 + v^2 > w_1^4$


and hence $w > w_1$.

This will allow us to obtain a contradiction. Just as we went
from x, y, w to $x_1, y_1, w_1$ in § 8, we can go from $x_1, y_1, w_1$ to another
solution $x_2, y_2, w_2$, with $w_1 > w_2$ just as we had $w > w_1$. Repeating
the process will give us yet another solution $x_3, y_3, w_3$ with $w_2 > w_3$.
Continuing in this way, we obtain a series of solutions with

(17)  $w > w_1 > w_2 > w_3 > \cdots$


These numbers are all positive whole numbers. There is only a
finite number of positive whole numbers less than w, so the sequence
(17) must finally end, say with $w_k$. But this $w_k$ belongs to a solution
$x_k, y_k, w_k$, and we can apply the process of § 8 to it to obtain still
another solution $x_{k+1}, y_{k+1}, w_{k+1}$ with $w_k > w_{k+1}$. This contradicts
the fact that (17) ends with $w_k$. Therefore we have established
the fact that (8) does not have any solution in positive whole
numbers, since the assumption of such a solution has led to a
contradiction.
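
A direct search agrees with this conclusion for small numbers. The sketch below (the function name and the bound are our own, chosen for illustration) tries every pair of positive whole numbers x and y up to a bound and tests whether $x^4 + y^4$ is a perfect square. Such a search can only fail to find solutions in the range tried; the general statement rests on the descent argument above.

```python
from math import isqrt


def fourth_power_solutions(limit):
    """Return all (x, y, w) with 1 <= x <= y <= limit and x^4 + y^4 = w^2."""
    found = []
    for x in range(1, limit + 1):
        for y in range(x, limit + 1):
            s = x ** 4 + y ** 4
            w = isqrt(s)
            if w * w == s:
                found.append((x, y, w))
    return found
```

Allowing y = 0 would admit the trivial solutions mentioned earlier, such as x = 1, y = 0, w = 1, which is why the search starts at 1.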

The basic idea of this proof was called the “principle of infinite 
descent” by Fermat. It consists of obtaining a contradiction by 
finding a process (in this case the repeated application of § 8) that 
yields a never-ending sequence of decreasing positive whole numbers. 
Such a sequence must end, since there is only a finite number of 
positive whole numbers less than n, the numbers (n - 1), (n - 2), ..., 3,
2, 1, which are only (n - 1) in number.

We have used the principle of infinite descent once before; it
was essential to our proof of the irrationality of $\sqrt{2}$ in Chapter 4.


15. The Theorem of the Arithmetic 
and Geometric Means 


A careful experimenter measures a certain object and finds its 
length to be 2.172 feet. On repeating his measurements twice 
more he obtains the lengths 2.176 ft. and 2.171 ft. What should 
he accept as the true length of the object? In such a case it is custo- 
mary to use the average of the measurements, to add them and 
divide by their number. The experimenter would find the total 
to be 6.519 and, dividing by 3, would accept 2.173 ft. as the length 
of the object. This average that we have described is called the 
arithmetic mean. The arithmetic mean of $n$ numbers $a_1, a_2,
a_3, \ldots, a_n$ is

(1)  $A = \frac{1}{n}(a_1 + a_2 + a_3 + \cdots + a_n)$.


We shall soon see that there are other possible kinds of averages 
besides the arithmetic mean, but first let us see just what sort of
thing an average should be. If the experimenter had made the 
same measurements in inches instead of feet, we would certainly 
agree that the average obtained should be the same as before, except 
that it would now be expressed in terms of inches. Each measure- 
ment would have been multiplied by 12 and the result of averaging 
should be multiplied by 12. The same thing should be true whether 
we multiply by 12 or by any other number $t$, since the experimenter
could have used any sort of ruler to make his measurements. If
the arithmetic mean (1) is to be a reasonable sort of average it must
have this property: if each $a$ is multiplied by $t$ the final result $A$
must also be multiplied by $t$. Thus,

(A)  $tA = \frac{1}{n}(ta_1 + ta_2 + ta_3 + \cdots + ta_n)$.

This is clearly true, since dividing both sides of (A) by $t$ gives
equation (1). We shall express this property (A) by saying that 
the arithmetic mean is “homogeneous”. Anything that is to be used 
as an average should be homogeneous. 

A second, simpler property that an average should have is this: 
the average of $a_1, a_2, a_3, \ldots, a_n$ should neither be smaller than the
smallest a nor larger than the largest a. It would be unreasonable 
for the experimenter to have obtained, say, 2.177 as the average of 
his measurements. The arithmetic mean (1) has this property, 
since the sum of $n$ numbers is certainly no less than $n$ times the
smallest, and is no larger than n times the largest of the numbers. 
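
The two properties just described can be checked numerically on the experimenter's three measurements. The following is only an illustrative sketch in Python; the function name `arithmetic_mean` and the small tolerance used for comparing decimal fractions are assumptions of this example, not part of the text.

```python
def arithmetic_mean(values):
    """Arithmetic mean (1): the sum of the values divided by their number."""
    return sum(values) / len(values)

measurements_ft = [2.172, 2.176, 2.171]
mean_ft = arithmetic_mean(measurements_ft)

# Homogeneity: measuring in inches multiplies every value, and the mean, by 12.
measurements_in = [12 * a for a in measurements_ft]
mean_in = arithmetic_mean(measurements_in)
is_homogeneous = abs(mean_in - 12 * mean_ft) < 1e-9

# Second property: the mean lies between the smallest and the largest value.
is_between = min(measurements_ft) <= mean_ft <= max(measurements_ft)

print(round(mean_ft, 3), is_homogeneous, is_between)
```

The tolerance is needed only because decimal fractions are stored approximately on a computer; the properties themselves hold exactly.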

There is one fairly obvious way in which the arithmetic mean can 
be changed. If the experimenter had made the last of his three 
measurements considerably more carefully than the other two, he 
might feel that he should give more weight to the last measurement, 
2.171. If he felt that it deserved 4 times the weight of the other 
measurements he could list the first two measurements 2.172 and 
2.176 and could then write down the third one 4 times. Now to 
average these he would add them all and divide by 6, obtaining 
2.172 ft. It might have been easier to list the last measurement 
just once but to remember to multiply it by 4 and to divide the total
by $1 + 1 + 4 = 6$. Such an average is called a weighted arithmetic
mean. If $a_1$ is to be given a weight $w_1$, $a_2$ weight $w_2$, $a_3$
weight $w_3, \ldots, a_n$ weight $w_n$, then the weighted arithmetic mean is

$\frac{1}{w_1 + w_2 + w_3 + \cdots + w_n}(w_1a_1 + w_2a_2 + w_3a_3 + \cdots + w_na_n)$.

It is quite easy to verify the fact that the weighted arithmetic mean
has the two properties that should be shared by all averages. Al- 
though the topics which we shall discuss can be extended to weighted 
means, we shall not attempt this extension. It would only make 
the notation more cumbersome without adding anything to the 
essential ideas. 
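
As a small check of the weighted example above, the sketch below computes the weighted arithmetic mean both ways: by the formula with weights 1, 1, 4 and by writing the third measurement down four times. The function name `weighted_mean` is an assumption made for this illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum of w_i * a_i divided by the sum of the w_i."""
    return sum(w * a for w, a in zip(weights, values)) / sum(weights)

values = [2.172, 2.176, 2.171]
weights = [1, 1, 4]
by_formula = weighted_mean(values, weights)

# The same average obtained by listing the last measurement four times.
listed = [2.172, 2.176] + [2.171] * 4
by_listing = sum(listed) / len(listed)

print(round(by_formula, 3), round(by_listing, 3))
```

Both computations give the 2.172 ft. found in the text, up to rounding.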

Turning to a new example, suppose we have 5 squares, two of 
side 1 in., one 2 in., one 5 in., and one 7 in. What is the average
size of the squares? If we are interested in the lengths of the sides
of the squares we merely find the arithmetic mean of the lengths,
3.2 in. If we are interested in the areas of the squares, however,
we have two of area 1 sq. in., one of 4 sq. in., one of 25 sq. in., and one
of 49 sq. in., and the arithmetic mean of the areas is 16 sq. in. This
corresponds to a square of side 4 in. With respect to sides we would
want to say that the average square has side 3.2 in., but with respect 
to areas the average square is larger, having sides of 4 in. If we 
think about how we obtained the number 4 we see that we squared 
the lengths, added these squares, divided by their number, 5, and 
took the square root of the result. Such an average is called the 
root mean square. The root mean square of the numbers $a_1, a_2,
a_3, \ldots, a_n$ is

(2)  $R = \sqrt{\frac{1}{n}(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)}$.


This average also has the two properties that we demand of all 
averages. We shall only verify that it is homogeneous. If we 
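multiply each $a$ in (2) by $t$ (for positive $t$) we have

$\sqrt{\frac{1}{n}(t^2a_1^2 + t^2a_2^2 + t^2a_3^2 + \cdots + t^2a_n^2)} = \sqrt{\frac{1}{n}t^2(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)}$

$= t\sqrt{\frac{1}{n}(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)}$,

but this is just the $R$ of (2) multiplied by $t$.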
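
The example of the five squares can be recomputed directly. This is a minimal sketch; the function name `root_mean_square` and the scale factor 2.54 (inches to centimeters) are choices made only for this illustration.

```python
import math

def root_mean_square(values):
    """Root mean square (2): square, average, then take the square root."""
    return math.sqrt(sum(a * a for a in values) / len(values))

sides = [1, 1, 2, 5, 7]              # side lengths in inches
mean_side = sum(sides) / len(sides)  # arithmetic mean of the sides
rms_side = root_mean_square(sides)   # side of the square of average area

# Homogeneity for a positive factor t, here a change of units.
t = 2.54
rms_scaled = root_mean_square([t * a for a in sides])

print(mean_side, rms_side, abs(rms_scaled - t * rms_side) < 1e-9)
```

The two averages agree with the 3.2 in. and 4 in. of the example.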

In the preceding example the arithmetic mean is not larger than 
the root mean square. This is not a mere accident of the example, 
but is true for any set of numbers $a_1, a_2, a_3, \ldots, a_n$. In order to
prove this fact we shall first prove another fact, (6), which is of 
considerable interest in itself. 

If any number, whether positive, negative, or zero, is squared, 
the result is either a positive number or zero. Therefore we have

$0 \le (c_1 - d_1)^2 = c_1^2 - 2c_1d_1 + d_1^2$

no matter what the numbers $c_1$ and $d_1$ are. Adding $2c_1d_1$ to both
sides of this inequality we have

$2c_1d_1 \le c_1^2 + d_1^2$.

In a similar way we have

$2c_2d_2 \le c_2^2 + d_2^2$,

$2c_3d_3 \le c_3^2 + d_3^2$,

$\cdots$

$2c_nd_n \le c_n^2 + d_n^2$.


Since the left side of each inequality is less than or equal to the 
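right side, the sum of the left sides is less than or equal to the sum
of the right sides,

$2c_1d_1 + 2c_2d_2 + 2c_3d_3 + \cdots + 2c_nd_n$
$\le c_1^2 + d_1^2 + c_2^2 + d_2^2 + c_3^2 + d_3^2 + \cdots + c_n^2 + d_n^2$.

We can rewrite this in the form

(3)  $2(c_1d_1 + c_2d_2 + c_3d_3 + \cdots + c_nd_n)$
$\le (c_1^2 + c_2^2 + c_3^2 + \cdots + c_n^2) + (d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2)$.

This holds for any numbers $c_1, c_2, c_3, \ldots, c_n$, $d_1, d_2, d_3, \ldots, d_n$. Now
if $a_1, a_2, a_3, \ldots, a_n$ and $b_1, b_2, b_3, \ldots, b_n$ are also any numbers
(provided the $a$'s are not all zero and the $b$'s are not all zero), we
can put

(4)  $r = \sqrt[4]{\frac{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}}$,

$c_1 = ra_1, c_2 = ra_2, c_3 = ra_3, \ldots, c_n = ra_n$,

$d_1 = \frac{b_1}{r}, d_2 = \frac{b_2}{r}, d_3 = \frac{b_3}{r}, \ldots, d_n = \frac{b_n}{r}$.

Since $c_1d_1 = ra_1 \cdot \frac{b_1}{r} = a_1b_1$, $c_2d_2 = a_2b_2$, $\ldots$, $c_nd_n = a_nb_n$, (3) now
becomes

(5)  $2(a_1b_1 + a_2b_2 + a_3b_3 + \cdots + a_nb_n)$
$\le r^2(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2) + \frac{1}{r^2}(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)$.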
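Using (4), the first part of the right side can be written as

$r^2(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)$
$= \sqrt{\frac{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}}\,(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)$
$= \sqrt{(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)}$.

Similarly, the second part of the right side of (5) is

$\frac{1}{r^2}(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)$
$= \sqrt{\frac{a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2}{b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2}}\,(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)$
$= \sqrt{(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)}$,

and (5) becomes

$2(a_1b_1 + a_2b_2 + a_3b_3 + \cdots + a_nb_n)$
$\le 2\sqrt{(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)}$.

Dividing each side by 2 we have

(6)  $a_1b_1 + a_2b_2 + a_3b_3 + \cdots + a_nb_n$
$\le \sqrt{(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + b_3^2 + \cdots + b_n^2)}$.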


This is Cauchy’s inequality, named after the French mathematician 
A. L. Cauchy. 
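
A numerical experiment is of course no substitute for the proof, but it can make Cauchy's inequality (6) more tangible. The sketch below, whose names (`cauchy_sides`, `all_hold`) and random test data are assumptions of this example, compares the two sides of (6) for many randomly chosen sets of numbers and for one case in which the two sides are equal.

```python
import math
import random

def cauchy_sides(a, b):
    """Return the left and right sides of Cauchy's inequality (6)."""
    left = sum(x * y for x, y in zip(a, b))
    right = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return left, right

random.seed(0)  # fixed seed so that the experiment is repeatable
all_hold = True
for _ in range(1000):
    n = random.randint(1, 8)
    a = [random.uniform(-10, 10) for _ in range(n)]
    b = [random.uniform(-10, 10) for _ in range(n)]
    left, right = cauchy_sides(a, b)
    if left > right + 1e-9:  # small tolerance for rounding errors
        all_hold = False

# When the b's are a positive multiple of the a's, the two sides agree.
equal_left, equal_right = cauchy_sides([1, 2, 3], [2, 4, 6])

print(all_hold, equal_left, equal_right)
```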

In order to show that the arithmetic mean is not larger than 
the root mean square, we remember that the $a$ and $b$ in (6) can be
any numbers and we take $b_1 = b_2 = b_3 = \cdots = b_n = \frac{1}{n}$. We then
have

$\frac{a_1}{n} + \frac{a_2}{n} + \frac{a_3}{n} + \cdots + \frac{a_n}{n}$
$\le \sqrt{(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)\left(\frac{1}{n^2} + \frac{1}{n^2} + \cdots + \frac{1}{n^2}\right)}$,

where there are exactly $n$ terms $\frac{1}{n^2}$ in the last parenthesis on the
right. This is equal to $n \cdot \frac{1}{n^2} = \frac{1}{n}$ and our inequality becomes

$\frac{1}{n}(a_1 + a_2 + a_3 + \cdots + a_n) \le \sqrt{\frac{1}{n}(a_1^2 + a_2^2 + a_3^2 + \cdots + a_n^2)}$.

This is just what we wanted to prove, since the left side is the
arithmetic mean (1) and the right side is the root mean square (2).

It is worthwhile to review the method by which we proved (6). 
The essential idea is that the square of a number is never negative. 
From this we obtained a number of inequalities whose sum is (3). 
Now (3) was not what we were looking for, but we were then able 
to obtain (6) merely by replacing the letters $c$ and $d$ with certain
combinations of other letters. This idea of replacing one set of 
letters with another is often used in the study of inequalities, some- 
times with very surprising results. In the proof of (6) it would 
have been possible, although not quite so simple, to prove (6) first 
and then to obtain (3) by changing the letters in (6). 

In order to introduce one more sort of average we shall look at 
another example of measuring. An object is to be weighed by 
putting it on one pan of a balance and counterbalancing it with 
weights on the other pan. If the balance is not correctly built or 
has been damaged there may be a slight difference in the lengths 
of its two arms and this will make the weighing incorrect. It is 
usually impossible to measure the lengths of the arms accurately, 
so we shall make two weighings, one with the object in the left pan, 
one with it in the right pan. If the results of these two weighings 
are $a_1$ and $a_2$, what should we take as the true weight? Should
we use the arithmetic mean, or is some other kind of average better?
To answer this question let us suppose that the length of the left
arm of the balance is $l$ and that of the right arm is $r$. We cannot
measure these lengths but the arms must have certain lengths.
Now it is shown in elementary physics that the product of the weight
by the length of the arm on one side is equal to the corresponding
product on the other side. If $w$ is the true weight of the object
this means $wl = a_1r$ for the first weighing and $wr = a_2l$ for the second.
If we multiply the left sides of these equations, and multiply the right
sides we find $wl \cdot wr = a_1r \cdot a_2l$ or, dividing by $lr$, $w^2 = a_1a_2$. The true
weight of the object is given by the formula $w = \sqrt{a_1a_2}$. This is
a new sort of average. It is called the geometric mean of the
numbers $a_1$ and $a_2$.
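
The two weighings can be simulated to see the formula at work. In the sketch below the true weight and the arm lengths are invented values chosen only for illustration (in practice the arm lengths are exactly what we cannot measure); the readings $a_1$ and $a_2$ are computed from the lever relations $wl = a_1r$ and $wr = a_2l$.

```python
import math

true_weight = 10.0
left_arm, right_arm = 1.00, 1.02   # hypothetical, slightly unequal arms

# Readings of the two weighings, from w*l = a1*r and w*r = a2*l.
a1 = true_weight * left_arm / right_arm
a2 = true_weight * right_arm / left_arm

geometric = math.sqrt(a1 * a2)     # recovers the true weight
arithmetic = (a1 + a2) / 2         # slightly too large

print(round(geometric, 6), round(arithmetic, 6))
```

With these assumed arm lengths the geometric mean returns the true weight, while the arithmetic mean of the two readings overshoots it a little.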

What is meant by the geometric mean of more than two numbers? 
If we are given $n$ positive numbers $a_1, a_2, a_3, \ldots, a_n$, we will first
multiply them together to obtain $a_1a_2a_3 \cdots a_n$. Should we then
take the square root of this number? To decide this, we remember
that an average must be homogeneous. If we multiply each $a$ by
$t$ this product becomes $ta_1 \cdot ta_2 \cdot ta_3 \cdots ta_n = t^na_1a_2a_3 \cdots a_n$. If the
final result is merely to be multiplied by $t$, we must take the $n$-th
root, not the square root. We can now say that the geometric
mean of the $n$ positive numbers $a_1, a_2, a_3, \ldots, a_n$ is

(7)  $G = \sqrt[n]{a_1a_2a_3 \cdots a_n}$.


One of the most famous theorems connected with means is the 
theorem of the geometric and arithmetic means. It states that the 
geometric mean of n positive numbers is never larger than the 
arithmetic mean, $G \le A$. There are several proofs of this theorem.
We shall reproduce a particularly interesting and simple proof 
given by Cauchy. 

We first prove the theorem for two numbers $a_1$ and $a_2$. We have

$\left(\frac{a_1 + a_2}{2}\right)^2 = \frac{1}{4}(a_1^2 + 2a_1a_2 + a_2^2) = \frac{1}{4}(a_1^2 - 2a_1a_2 + a_2^2 + 4a_1a_2)$

$= \frac{1}{4}(a_1^2 - 2a_1a_2 + a_2^2) + a_1a_2 = \left(\frac{a_1 - a_2}{2}\right)^2 + a_1a_2$.

Now since the square $\left(\frac{a_1 - a_2}{2}\right)^2$ is not negative, we see that $\left(\frac{a_1 + a_2}{2}\right)^2$
is $a_1a_2$ increased by an amount that is not negative. That is,

(8)  $a_1a_2 \le \left(\frac{a_1 + a_2}{2}\right)^2$.

The square root of the left side is $G$ and the square root of the right
side is $A$, so we have proved the theorem for any two numbers,
$a_1$ and $a_2$.

Now we want to prove the theorem for more numbers. It would 
be natural to try three numbers but as it turns out, it is easier to 
prove it for four. Since (8) is true for any numbers $a_1$ and $a_2$, we can
replace $a_1$ by $a_3$ and $a_2$ by $a_4$ and have

$a_3a_4 \le \left(\frac{a_3 + a_4}{2}\right)^2$.

From this and (8) we have

$a_1a_2a_3a_4 \le \left(\frac{a_1 + a_2}{2}\right)^2\left(\frac{a_3 + a_4}{2}\right)^2$,

which can be written in the form

(9)  $a_1a_2a_3a_4 \le \left[\left(\frac{a_1 + a_2}{2}\right)\left(\frac{a_3 + a_4}{2}\right)\right]^2$.

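Again, since (8) holds for any numbers, we can replace $a_1$ by
$\left(\frac{a_1 + a_2}{2}\right)$ and $a_2$ by $\left(\frac{a_3 + a_4}{2}\right)$ to get

$\left(\frac{a_1 + a_2}{2}\right)\left(\frac{a_3 + a_4}{2}\right) \le \left(\frac{\frac{a_1 + a_2}{2} + \frac{a_3 + a_4}{2}}{2}\right)^2 = \left(\frac{a_1 + a_2 + a_3 + a_4}{4}\right)^2$.

Using this in (9) we find

(10)  $a_1a_2a_3a_4 \le \left(\frac{a_1 + a_2 + a_3 + a_4}{4}\right)^4$.

This is our theorem for four numbers, since the 4-th root of the
left side is $G$ and the 4-th root of the right side is $A$.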

We can repeat this argument again to prove the theorem for 
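eight numbers. Using (10) we have

$a_1a_2a_3a_4a_5a_6a_7a_8 \le \left(\frac{a_1 + a_2 + a_3 + a_4}{4}\right)^4\left(\frac{a_5 + a_6 + a_7 + a_8}{4}\right)^4$,

and we can apply (8) to the right side as before to obtain

$a_1a_2a_3a_4a_5a_6a_7a_8 \le \left(\frac{a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 + a_8}{8}\right)^8$.

This gives us the theorem for eight numbers.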

If we continue the proof in exactly the same manner, we see that 
$G \le A$ for the set of $n$ positive numbers $a_1, a_2, a_3, \ldots, a_n$ if $n$ is 2,
4, 8, 16, $\ldots$ = $2, 2^2, 2^3, 2^4, \ldots$, that is, if $n$ is a power of 2. We
must still prove the theorem for $n = 3, 5, 6, \ldots$, the numbers that
are not powers of 2. This is now fairly easy. If $a_1, a_2, a_3, \ldots, a_n$
are any positive numbers, we can find a power $2^m$ of 2 which is
larger than $n$ (if $n = 50$ for example we take $m = 6$ since $2^6 = 64 > 50$).
We then write

$b_1 = a_1, b_2 = a_2, b_3 = a_3, \ldots, b_n = a_n$,
$b_{n+1} = b_{n+2} = \cdots = b_{2^m} = A$,

where $A$ is the arithmetic mean (1) of the $a$'s. Now $b_1, b_2, b_3, \ldots,
b_{2^m}$ are a set of $2^m$ positive numbers; so, from what we have already
proved, their geometric mean is no larger than their arithmetic
mean,

$\sqrt[2^m]{b_1b_2b_3 \cdots b_{2^m}} \le \frac{1}{2^m}(b_1 + b_2 + b_3 + \cdots + b_{2^m})$.

Raising both sides of this inequality to the $2^m$-th power and
inserting the values of the $b$'s, we have

$a_1a_2a_3 \cdots a_n \, AA \cdots A$
$\le \left[\frac{1}{2^m}(a_1 + a_2 + a_3 + \cdots + a_n + A + A + \cdots + A)\right]^{2^m}$

or

$a_1a_2a_3 \cdots a_n A^{2^m - n} \le \left[\frac{1}{2^m}(a_1 + a_2 + a_3 + \cdots + a_n) + \frac{2^m - n}{2^m}A\right]^{2^m}$.

By (1) the right side reduces to

$\left[\frac{n}{2^m}A + \frac{2^m - n}{2^m}A\right]^{2^m} = \left[\frac{n + 2^m - n}{2^m}A\right]^{2^m} = A^{2^m}$,

and we can divide both sides by $A^{2^m - n}$ to obtain

$a_1a_2a_3 \cdots a_n \le A^n$.

Taking the $n$-th root of each side, we have

$\sqrt[n]{a_1a_2a_3 \cdots a_n} \le A = \frac{1}{n}(a_1 + a_2 + \cdots + a_n)$,

or $G \le A$, and this is the theorem of the arithmetic and geometric
means.
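
The device used in the last step, filling the set up to a power of 2 with copies of $A$, can be followed on a small example. The sketch below is only an illustration; the function names and the sample numbers (the sides of the five squares used earlier) are choices made here, and `math.prod` requires Python 3.8 or later.

```python
import math

def arithmetic_mean(values):
    return sum(values) / len(values)

def geometric_mean(values):
    return math.prod(values) ** (1 / len(values))

def pad_with_mean(values):
    """Extend the list to length 2**m > n by appending copies of A."""
    n = len(values)
    a_mean = arithmetic_mean(values)
    size = 1
    while size <= n:          # smallest power of 2 that is larger than n
        size *= 2
    return list(values) + [a_mean] * (size - n)

a = [1, 1, 2, 5, 7]
b = pad_with_mean(a)          # 5 numbers padded to 8 numbers

A = arithmetic_mean(a)
G = geometric_mean(a)

# Padding with A leaves the arithmetic mean unchanged.
print(len(b), abs(arithmetic_mean(b) - A) < 1e-12, G <= A)
```

The padded set has the same arithmetic mean $A$ as the original set, which is what makes the reduction to the power-of-2 case work.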


16. The Spanning Circle of a Finite Set of Points 


1. We consider a finite set consisting of $n$ points $P_1, P_2, \ldots, P_n$,
in a plane. The distances between each pair of points $P_i$ and $P_j$
can be measured. There must be a largest distance among this
finite¹ set of distances. This largest distance is called the “span”
of the set of points.


¹ Using Chap. 8, § 9, IV, it is easily seen that one can find just
$\frac{n(n - 1)}{2}$ different pairs among the $n$ points.

If a set of $n$ points has span $d$, then we can draw a circle of radius
$d$ that completely surrounds the $n$ points (Fig. 50). All we need
do is draw a circle of radius $d$ with any one of the $n$ points, say $P_1$,
as center. Since the distance of every other point from $P_1$ is at
most $d$, the circle encloses all the other points $P_2, P_3, \ldots, P_n$ as well
as its center $P_1$.

Fig. 50

However, it is easy to construct a smaller circle that still encloses 
all the $n$ points. First we find the pair of points whose distance
apart is $d$. If there are several such pairs we can choose any one
of them. Calling these two points $P_1$ and $P_2$, we draw two circles
of radius $d$ about them (Fig. 50). The circle with $P_1$ as center
passes through $P_2$ and vice versa. Now all $n$ points of the set lie
in each circle, so they lie in the part of the plane that is common
to both circles, the part that is shaded in the figure. If the two
circles intersect at $S_1$ and $S_2$, then the circle having $S_1S_2$ as diameter
encloses the whole common area and hence the whole set of $n$ points.
The radius $r$ of this new circle can be found by applying the
Pythagorean theorem to the triangle $P_1MS_1$, where $M$ is the midpoint
of $P_1P_2$,

$r^2 = d^2 - \left(\frac{d}{2}\right)^2 = \frac{3}{4}d^2$,  $r = \frac{d}{2}\sqrt{3}$.

We shall call a circle that surrounds all $n$ points of the set an
“enclosing” circle of the set. Besides the original enclosing circle of
radius $r = d$, we have found an enclosing circle of radius

$r = \frac{d}{2}\sqrt{3} = 0.866\ldots d$.
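
The construction of this section can be tried out on a randomly chosen set of points. The following sketch is illustrative only: the function name `span_pair`, the random points, and the small tolerances are assumptions of this example, and `math.dist` requires Python 3.8 or later. It finds a pair of points realizing the span $d$ and checks both enclosing circles, the one of radius $d$ about $P_1$ and the one of radius $\frac{d}{2}\sqrt{3}$ about the midpoint of $P_1P_2$.

```python
import math
import random
from itertools import combinations

def span_pair(points):
    """Return a pair of points whose distance apart is the span."""
    return max(combinations(points, 2), key=lambda pq: math.dist(*pq))

random.seed(1)  # fixed seed so that the experiment is repeatable
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(30)]

p1, p2 = span_pair(points)
d = math.dist(p1, p2)

# First enclosing circle: center P1, radius d.
inside_big = all(math.dist(p1, q) <= d + 1e-12 for q in points)

# Smaller enclosing circle: center at the midpoint M of P1P2, radius (d/2)*sqrt(3).
center = ((p1[0] + p2[0]) / 2, (p1[1] + p2[1]) / 2)
radius = d / 2 * math.sqrt(3)
inside_small = all(math.dist(center, q) <= radius + 1e-9 for q in points)

print(inside_big, inside_small)
```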


2. Can this number $\frac{d}{2}\sqrt{3}$ be replaced by a still smaller one?
The answer is given by the following theorem, discovered by H. W. E.
Jung: Every finite set of points of span $d$ has an enclosing circle
of radius no greater than $\frac{d}{3}\sqrt{3} = 0.577\ldots d$. There are many
finite sets of points with smaller enclosing circles, but some require
a circle this large. On the other hand, since two points cannot
be further apart than the diameter $2r$ of the enclosing circle, we have
$2r \ge d$, and therefore the enclosing circle can never have radius
less than $\frac{d}{2}$. The proof of Jung’s theorem will be the aim of this
chapter.

An enclosing circle of radius $\frac{d}{3}\sqrt{3}$ can easily be found for the
set of three points that form the vertices of an equilateral triangle
of side $d$. It is the circle circumscribed about the triangle. Using
the notation of Fig. 51 and letting $r + x = h$, we have

$d^2 = h^2 + \frac{d^2}{4}$

from the right triangle $ABD$. This gives us

(1)  $h^2 = \frac{3d^2}{4}$

and

(2)  $h = \frac{d}{2}\sqrt{3}$.

Fig. 51

Now from the triangle $DCM$ we have

$r^2 = x^2 + \frac{d^2}{4}$

and hence, since $x = h - r$,

$r^2 = h^2 - 2hr + r^2 + \frac{d^2}{4}$.

Using (1), we obtain

$d^2 = 2hr$,

and then, by (2),

$r = \frac{d^2}{2h} = \frac{d}{\sqrt{3}} = \frac{d}{3}\sqrt{3}$.


For the equilateral triangle it is evident that the circumscribed 
circle is the smallest possible enclosing circle. However, we shall 
not dwell on this point, since it will arise naturally later on. 

3. In order to prove Jung’s theorem for an arbitrary finite set 
of points, we will start by trying to choose a circle of least possible 
radius from all possible enclosing circles. We will use a series of 
steps that will continually lead us to smaller enclosing circles. 

I. An enclosing circle $C_1$ that has no points of the finite set $S$
on its circumference can always be replaced by a smaller circle $C_2$.
We can draw $C_2$ with the same center $M$ as $C_1$, and passing through
the point (or points) of $S$ that are farthest away from $M$.

II. If only one point of the finite set $S$ lies on the enclosing circle
$C_3$, then this circle can be replaced by a smaller one (Fig. 52).


Fig. 52


Letting $P_1$ be the point of $S$ on the circumference of $C_3$, we draw
all the circles that pass through another point of $S$ and have the
same tangent as $C_3$ at $P_1$. These circles all lie inside $C_3$. Let us
call the largest of all these circles $C_4$. It is different from $C_3$, since
$P_1$ is the only point of $S$ on the circumference of $C_3$, while $C_4$ has
another point of $S$ on its circumference. Now $C_4$ encloses all the
new circles and therefore all the points of $S$. Furthermore, it has
two of the points on its circumference and it is smaller than $C_3$.

III. The points of $S$ that lie on the circumference of an enclosing
circle divide it into arcs. For the sake of brevity, we shall call an
arc “point-free” if no points of $S$ lie on the arc, except that the
end points of the arc can be points of $S$. Our third step for
diminishing circles can now be stated: If a point-free arc on an
enclosing circle is more than one-half the circumference of the circle,
then the enclosing circle can be replaced by a smaller one.²

Let $P_1$ and $P_2$ be two points of $S$ that are end points of a
point-free arc $b$ of the enclosing circle $C_5$. Furthermore, let $b$ be larger
than half the circumference (Fig. 53). We draw the circle $C^*$
with $P_1P_2$ as diameter. If all the points of $S$ are enclosed by $C^*$,
then it is an enclosing circle smaller than $C_5$; for $C_5$, in which the
chord $P_1P_2$ is not a diameter (otherwise $b$ would be one-half a
circumference), must be greater than $C^*$. If not all the points of
$S$ are enclosed by $C^*$, then the remaining points must be in the
crescent-shaped area between $b$ and $C^*$ (the shaded area of the figure).
No points of $S$ other than $P_1$ and $P_2$ lie on $b$, since it is point-free.
We now draw all possible circles that pass through $P_1$, $P_2$, and a
point of $S$ lying in the crescent. The part of the circumference of
each of these circles that is inside $C_5$ is inside the crescent and
therefore outside of $C^*$. The part that is inside $C^*$ is outside of $C_5$.
Let $C_6$ be the circle whose arc in the crescent extends furthest from
the chord $P_1P_2$. This circle encloses all the points of $S$, since it
encloses all the points of $S$ that lie in the crescent as well as the
area common to $C_5$ and $C^*$. This area contains all the remaining
points of $S$. Furthermore, $C_6$ is smaller than $C_5$. Its circumference
lies between the circumferences of $C_5$ and $C^*$, so its center is nearer
to the chord $P_1P_2$ than the center of $C_5$, and hence it is smaller.

If an enclosing circle can no longer be decreased by means of 
I, II, or III, then it cannot have a point-free arc that is larger than 
half a circumference. Such a circle must either have two points
of $S$ as ends of a diameter, or it must have three or more points
of $S$ dividing its circumference into arcs that are less than half a
circumference. We shall call the first kind of enclosing circle a
“diametric circle”, the second kind a “three-point circle”. The
steps I, II, III applied to any enclosing circle will eventually lead
to one of these two types. It is possible for an enclosing circle to
be of both types at once, for example, a circle through the four
vertices of a square.


² Cases I and II can be considered as special cases of III, since in each of
them the complete circumference is a point-free arc.

4. Now we think of having drawn all possible circles that have
two points of the finite set as ends of a diameter, as well as all that
go through three points of the set.³ Not all of these circles will be
enclosing circles, but every diametric circle and three-point circle
will be included. Since altogether there is only a finite number
of circles, there can be only a finite number of diametric and
three-point circles. Therefore we can compare all these particular circles
and pick out the smallest one. This circle $c$ is the smallest possible
enclosing circle. For it is the smallest of all diametric and three-point
circles and, by I, II, and III, it must then also be smaller than any
other enclosing circle. Furthermore it is unique. If there were a
second enclosing circle $c'$ of the same size (Fig. 54), then $S$ would
lie in $c'$ as well as in $c$. Then $S$ would lie in the area common to
the two circles. Since this whole area can be enclosed by the smaller
circle $c^*$, this would contradict the minimal property of $c$. We shall
call this uniquely determined smallest enclosing circle of the finite
set $S$ the spanning circle of the finite set of points $S$.

The spanning circle $c$ can have no point-free arcs of more than
half the circumference since, according to III, it would then not 
be the smallest enclosing circle. 
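
The procedure of this section translates almost word for word into a small program: draw every circle on two points as diameter and every circle through three points, keep those that enclose the whole set, and take the smallest. The sketch below is a direct and deliberately naive rendering of that idea, not an efficient method; all function names, the tolerance `eps`, and the two sample sets are assumptions of this illustration, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations

def circle_on_diameter(p, q):
    """Circle having the segment pq as diameter: (center, radius)."""
    center = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    return center, math.dist(p, q) / 2

def circle_through(p, q, r):
    """Circle through three points, or None if they lie on one line."""
    ax, ay = p
    bx, by = q
    cx, cy = r
    det = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(det) < 1e-12:
        return None
    pa, pb, pc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (pa * (by - cy) + pb * (cy - ay) + pc * (ay - by)) / det
    uy = (pa * (cx - bx) + pb * (ax - cx) + pc * (bx - ax)) / det
    center = (ux, uy)
    return center, math.dist(center, p)

def encloses(circle, points, eps=1e-9):
    """True if every point lies inside or on the circle (up to eps)."""
    center, radius = circle
    return all(math.dist(center, s) <= radius + eps for s in points)

def spanning_circle(points):
    """Smallest enclosing circle among all two-point and three-point circles."""
    candidates = [circle_on_diameter(p, q) for p, q in combinations(points, 2)]
    for triple in combinations(points, 3):
        circle = circle_through(*triple)
        if circle is not None:
            candidates.append(circle)
    enclosing = [c for c in candidates if encloses(c, points)]
    return min(enclosing, key=lambda c: c[1])

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_center, square_radius = spanning_circle(square)

triangle = [(0, 0), (1, 0), (0.5, math.sqrt(3) / 2)]   # equilateral, side 1
triangle_center, triangle_radius = spanning_circle(triangle)

print(round(square_radius, 4), round(triangle_radius, 4))
```

For the equilateral triangle of side 1 the radius found agrees with the value $\frac{d}{3}\sqrt{3}$ computed in § 2. The function expects at least two points, and the number of circles examined grows very quickly with the size of the set.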

5. Now we shall show that the radius of the spanning circle
cannot exceed $\frac{d}{3}\sqrt{3}$. For this purpose we pick out a pair of
points of $S$ lying on the circumference of $c$ that are at least as far
apart as any other such pair. The distance $\delta$ between these points
is certainly no greater than the span $d$ of $S$.

First, it may happen that two points of $S$ are ends of a diameter
of $c$. In this case the diameter $2r$ of $c$ is equal to $\delta \le d$. Therefore
we have $r \le \frac{d}{2}$ and certainly $r < \frac{d}{3}\sqrt{3}$.


³ There are, at most, $\frac{n(n - 1)}{1 \cdot 2} + \frac{n(n - 1)(n - 2)}{1 \cdot 2 \cdot 3}$ such circles.

Secondly, this may not be the case. Then we pick out the
largest point-free arc $b$ on the circumference of $c$. If there are
several such arcs of equal size, we just pick out any one of them.
The end points $P_1$ and $P_2$ of $b$ will be points of $S$. The arc $b$ is less
than a semicircle, since no point-free arc of $c$ can be greater than half
the circumference, and if it were exactly half the circumference $P_1$
and $P_2$ would be ends of a diameter and we would be back to the
first case. Now we draw the chord $P_1P_2$ and erect perpendiculars
at its ends (Fig. 55). These perpendiculars will cut the circle in
two other points, $Q_1$ and $Q_2$. The arc $b'$ cut off by $Q_1$ and $Q_2$
lies opposite $b$ and is congruent to it. The points $Q_1$ and $Q_2$ do
not belong to $S$, for each of them forms, with one of the points
$P_1$, $P_2$, the ends of a diameter. Hence if $Q_1$ or $Q_2$ belonged to $S$,
we would again be back to the first case. However, the arc $b'$ cannot
be point-free. For if it were it could be extended past $Q_1$ and $Q_2$,
which do not belong to $S$, to form a larger point-free arc. And this
is impossible, since no point-free arc can be larger than $b$.


Fig. 54  Fig. 55


Consequently, there is at least one point $P_3$ of $S$ between $Q_1$
and $Q_2$ on $b'$. The points $P_1, P_2, P_3$ form an acute-angle triangle.
The angles at $P_1$ and $P_2$ are acute, since they are smaller than the
right angles that we constructed at these points. The angle at
$P_3$ intercepts the arc $b$, which is less than a semicircle. Therefore
the angle at $P_3$ is also acute, since an angle inscribed in a semicircle
is a right angle, and smaller angles correspond to smaller arcs.

The circumference of $c$ is divided into three arcs by $P_1, P_2, P_3$.
One of these arcs must be at least as large as one-third the
circumference, but it is less than a semicircle, since the angles of $P_1P_2P_3$
are acute. Its chord $P_iP_j$ must therefore be at least as great as the
chord intercepting one-third the circumference, that is, as great
as the side $s$ of an equilateral triangle inscribed in the circle $c$. Since
the length of $P_iP_j$ can at most be the span $d$, we have $s \le d$.

In § 2 we found that the radius $r'$ of the circle circumscribing
an equilateral triangle of side $s$ is $r' = \frac{s}{3}\sqrt{3}$. Since $s \le d$ we have,
for the radius $r$ of $c$,

$r = \frac{s}{3}\sqrt{3} \le \frac{d}{3}\sqrt{3}$,

which was to be proved.

6. We still want to see whether the bound $\frac{d}{3}\sqrt{3}$ cannot be
further decreased for general finite sets of points. Let us apply
the method of § 4 to the set $T$ consisting of the three vertices of an
equilateral triangle. The circles with two points of $T$ as ends of a
diameter and the circle through three points of $T$ are drawn in
Fig. 56. The only one of these that is an enclosing circle is the one
through the three points, the circle circumscribed about the triangle.
There are no other circles with which to compare this one, so it is
the spanning circle. The side of the triangle is the span $d$ and the
radius of the circumscribed circle is $\frac{d}{3}\sqrt{3}$. Since this particular
finite set has a spanning circle of radius $\frac{d}{3}\sqrt{3}$, it would be impossible
to reduce that bound for general finite sets.


Fig. 56  Fig. 57


Here we have used § 4 to prove that the circumscribed circle
is actually the spanning circle for the set $T$. That it is the spanning
circle seems fairly obvious without a proof, but it is not completely 
self-evident. The spanning circle of the vertices of an obtuse angle 
triangle is not the circumscribed circle, but the circle having the 
longest side of the triangle as diameter (Fig. 57). 


17. Approximating Irrational Numbers by
Means of Rational Numbers


The value $\frac{22}{7}$ is an old and very familiar approximate value for
$\pi$, the area of a circle of radius 1. Also $\sqrt{2}$ is nearly $\frac{7}{5}$. Precisely
what do we mean by such statements as these? The words
“approximate” and “nearly” do not have a real place in mathematical
speech, yet these statements must have some significance. Why is
$\frac{22}{7}$ invariably used to approximate $\pi$, in preference, say, to a
fraction with denominator 8?

1. If any number $w$ is given, then fractions or (as a
mathematician would say) rational numbers that are arbitrarily close to it
can be found. For example, if $w = \pi = 3.14159\ldots$, then the
fractions

$\frac{31}{10}, \frac{314}{100}, \frac{3141}{1000}, \ldots$

are getting closer and closer to $\pi$. The first fraction clearly differs
from $\pi$ by less than $\frac{1}{10}$, since $\frac{32}{10}$ would already be too large,
the second differs by less than $\frac{1}{100}$, etc. In the same way we can
approximate any number $w$ to any degree of accuracy if we only
know its decimal expansion.
This proposition has a certain esthetic blemish due to its connection 
with the number 10. Our whole system of decimal notation sets 
the number 10 apart from all the others, but the choice of 10 is 
merely a matter of convenience and custom, and it does not reflect 
any mathematical distinction attached to it. The proposition can 
be freed from this blemish by being put in the form of the following 
theorem, where the arbitrary number $n$ takes the place of 10, $10^2$, $\ldots$.

Theorem 1. If $w$ is any number and $n$ is any whole number, then there
is a rational number $\frac{m}{n}$, with denominator $n$, which differs from $w$ by less
than $\frac{1}{n}$, $0 \le w - \frac{m}{n} < \frac{1}{n}$.

For example, consider $w = \sqrt{2}$, $n = 5$. Then $w$ lies between
1 and 2, and hence it lies in some one of the 5 intervals between the
numbers

(1)  $1, \frac{6}{5}, \frac{7}{5}, \frac{8}{5}, \frac{9}{5}, 2$.

Each of these intervals is of length $\frac{1}{5}$, so all we need do is pick out
the last fraction above that is less than $\sqrt{2}$. It will be a fraction
with denominator 5 and will differ from $\sqrt{2}$ by less than $\frac{1}{5}$. The
same thing can be done for any arbitrary number $w$ and whole
number $n$. If $g$ is the largest whole number less than $w$, we consider
the rational numbers

(2)  $g, g + \frac{1}{n}, g + \frac{2}{n}, \ldots, g + \frac{n - 1}{n}, g + 1$,

and pick out the last one that is less than or equal to $w$. This
number, which we may call $g + \frac{l}{n}$, will differ from $w$ by less than
$\frac{1}{n}$, so we have

(3)  $0 \le w - \left(g + \frac{l}{n}\right) < \frac{1}{n}$,

which proves the theorem.
Let us find the numbers (1) between which $\sqrt{2}$ falls. The
computation is a little simpler if we eliminate the denominator 5.
We multiply everything by 5 and look for the position of $5\sqrt{2}$
relative to the numbers

(1a)  5, 6, 7, 8, 9, 10.

Since $5\sqrt{2} = \sqrt{25}\sqrt{2} = \sqrt{50}$, we are really looking for the largest
whole number less than $\sqrt{50}$. Now from $49 < 50 < 64$ we have
$7 < \sqrt{50} < 8$, and hence 7 is the number we are seeking. Dividing
by 5, we see that $\sqrt{2}$ lies between $\frac{7}{5}$ and $\frac{8}{5}$ and we have

(4)  $0 \le \sqrt{2} - \frac{7}{5} < \frac{1}{5}$.
The general proof of theorem 1 can also be simplified if we
similarly dispense with the denominators. We consider $nw$ instead
of $w$ and look for the largest whole number that does not exceed $nw$.
Calling this number $m$, we have $m \le nw < m + 1$, and hence
$0 \le nw - m < 1$. Dividing by $n$, we obtain the required result:

$0 \le w - \frac{m}{n} < \frac{1}{n}$.
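
The simplified proof is at the same time a recipe for computing $m$: it is the largest whole number not exceeding $nw$. The sketch below applies it to $w = \sqrt{2}$, $n = 5$. The function name is an assumption of this example, and the computation uses ordinary floating-point numbers, so it illustrates the theorem rather than proving anything.

```python
import math

def theorem1_numerator(w, n):
    """Largest whole number m with m <= n*w, so that 0 <= w - m/n < 1/n."""
    return math.floor(n * w)

w = math.sqrt(2)
n = 5
m = theorem1_numerator(w, n)
error = w - m / n

print(m, n, 0 <= error < 1 / n)
```

For these values the numerator found is 7, in agreement with (4).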


2. The fact that there are rational numbers close to $\pi$ and $\sqrt{2}$
is therefore nothing unusual. What, then, is the significance of the
statements that $\pi$ is approximately $\frac{22}{7}$ and that $\sqrt{2}$ is
approximately $\frac{7}{5}$? It lies in the fact that theorem 1 tells us only that
$\frac{7}{5}$ will differ from $\sqrt{2}$ by less than $\frac{1}{5}$ while actually it is much
closer. It is easy to verify the inequality

$\frac{7}{5} < \sqrt{2} < \frac{17}{12}$,

and therefore $\sqrt{2}$ differs from $\frac{7}{5}$ by less than $\frac{17}{12}$ does. That is,
we have

$\sqrt{2} - \frac{7}{5} < \frac{17}{12} - \frac{7}{5} = \frac{1}{60}$,

which is much smaller than the $\frac{1}{5}$ guaranteed by theorem 1.
Similarly, the inequality

$3\frac{10}{71} < \pi < 3\frac{1}{7}$,

which was given by Archimedes, yields

$\frac{22}{7} - \pi < \left(3 + \frac{1}{7}\right) - \left(3 + \frac{10}{71}\right) = \frac{1}{497}$,

to be compared with $\frac{1}{7}$ from theorem 1.


Still, this idea of a rational number ‘“‘much closer” to the given 
number is not yet a mathematical concept. We are still looking 
for a definite unequivocal meaning. It is furnished by: 

Theorem 2. If $w$ is an irrational number and $N$ is any whole number,
then there is a fraction $\frac{m}{n}$, whose denominator does not exceed $N$, and
which differs from $w$ by less than $\frac{1}{n(N + 1)}$. Furthermore, there is an infinite
number of fractions $\frac{m}{n}$ that differ from $w$ by less than $\frac{1}{n^2}$.

Applied to the two previous examples, this asserts that $\sqrt{2} - \frac{7}{5}$
is less than $\frac{1}{5^2} = \frac{1}{25}$ and $\frac{22}{7} - \pi$ is less than $\frac{1}{49}$. Even though
these limits still exceed the actual difference, this theorem represents
an essential improvement over the first trivial theorem.
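
To get a feeling for the second assertion of theorem 2, one can simply search for fractions $\frac{m}{n}$ that come within $\frac{1}{n^2}$ of a given number. The sketch below does this for denominators up to a chosen bound. The function name, the bound, and the restriction to the numerator nearest to $nw$ are choices made for this illustration, and the floating-point arithmetic is reliable only for small denominators.

```python
import math

def good_fractions(w, max_denominator):
    """Pairs (m, n) with |w - m/n| < 1/n**2, taking m nearest to n*w."""
    found = []
    for n in range(1, max_denominator + 1):
        m = round(n * w)               # nearest whole numerator
        if abs(w - m / n) < 1 / n**2:
            found.append((m, n))
    return found

hits_sqrt2 = good_fractions(math.sqrt(2), 30)
hits_pi = good_fractions(math.pi, 10)

print(hits_sqrt2)
print(hits_pi)
```

The fractions $\frac{7}{5}$ and $\frac{17}{12}$ appear in the first list and $\frac{22}{7}$ in the second, while a fraction such as $\frac{6}{4}$ does not qualify.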

To prove the theorem we do not consider the number $Nw$ and the
largest whole number less than $Nw$ alone. Instead, we consider
the whole series of numbers

$w, 2w, 3w, \ldots, Nw$,

and the largest whole numbers less than each of these:

$g_1, g_2, g_3, \ldots, g_N$.

Then we have

$0 < w - g_1 < 1$, $0 < 2w - g_2 < 1$, $0 < 3w - g_3 < 1$, $\ldots$, $0 < Nw - g_N < 1$.


The proof will be clearer if we first carry it out for a particular 
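example. We take $w = \sqrt{2}$, $N = 13$ and have

\begin{align*}
\sqrt{2} &= 1.414\cdots = 1 + 0.414\cdots \\
2\sqrt{2} &= 2.828\cdots = 2 + 0.828\cdots \\
3\sqrt{2} &= 4.242\cdots = 4 + 0.242\cdots \\
4\sqrt{2} &= 5.656\cdots = 5 + 0.656\cdots \\
5\sqrt{2} &= 7.071\cdots = 7 + 0.071\cdots \\
6\sqrt{2} &= 8.485\cdots = 8 + 0.485\cdots \\
7\sqrt{2} &= 9.899\cdots = 9 + 0.899\cdots \\
8\sqrt{2} &= 11.313\cdots = 11 + 0.313\cdots \\
9\sqrt{2} &= 12.727\cdots = 12 + 0.727\cdots \\
10\sqrt{2} &= 14.142\cdots = 14 + 0.142\cdots \\
11\sqrt{2} &= 15.556\cdots = 15 + 0.556\cdots \\
12\sqrt{2} &= 16.970\cdots = 16 + 0.970\cdots \\
13\sqrt{2} &= 18.384\cdots = 18 + 0.384\cdots
\end{align*}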


Of all the amounts by which each of these exceeds the whole number just under it, the fifth is the smallest, $5\sqrt{2} - 7 = 0.071\cdots$, and the 12th is the largest, $12\sqrt{2} = 16 + 0.970\cdots = 17 - 0.029\cdots$, or $17 - 12\sqrt{2} = 0.029\cdots$. We think of taking all 13 of the amounts by which the numbers exceed the whole numbers just under them and putting them in order according to size. Then we have 13 numbers arranged between 0 and 1. They form 14 intervals which may be of various sizes. One of these intervals must have a length of less than $\frac{1}{14}$. For if they all were greater than or equal to $\frac{1}{14}$, then all 14 together would be at least $\frac{14}{14} = 1$ long, and they could have the total length 1 only if each was exactly $\frac{1}{14}$. But if they were all exactly $\frac{1}{14}$, then $\sqrt{2}$, being one of the 13 numbers, would exceed 1 by a whole number of fourteenths, $\sqrt{2} = 1 + \frac{k}{14}$. That is, $\sqrt{2}$ would be rational. Since we have assumed that $w$ is irrational, and have proved the irrationality of $\sqrt{2}$ in Chapter 4, the intervals cannot all be $\frac{1}{14}$, so there must be some interval that is less than $\frac{1}{14}$. We only know that such an interval exists, but not which one it is. However, we can call the lower end of the interval $a\sqrt{2} - g_a = r_1$ and the upper end $b\sqrt{2} - g_b = r_2$. Then we have

\[ 0 < r_2 - r_1 = (b\sqrt{2} - g_b) - (a\sqrt{2} - g_a) < \frac{1}{14}, \]

and therefore

\[ 0 < (b - a)\sqrt{2} - (g_b - g_a) < \frac{1}{14}. \]


Since $a$ and $b$ are some two of the numbers $1, 2, 3, \ldots, 13$, and $a$ and $b$ are unequal, their difference $b - a$ is, apart from sign, again one of these numbers, i.e. $-13 < b - a < 13$. Therefore $(b - a)\sqrt{2}$ or $(a - b)\sqrt{2}$, whichever is positive, is one of the 13 multiples of $\sqrt{2}$: we shall call it $n\sqrt{2}$. Furthermore $n\sqrt{2}$ differs from the whole number ($g_b - g_a$ or $g_a - g_b$) just below or just above it by less than $\frac{1}{14}$. Calling this whole number $m$, we then have

\[ -\frac{1}{14} < n\sqrt{2} - m < \frac{1}{14}, \qquad n \le 13. \]

Dividing by $n$ we obtain the final result

\[ -\frac{1}{14n} < \sqrt{2} - \frac{m}{n} < \frac{1}{14n}. \]


The general proof proceeds in exactly the same way. The irrational number $w$ takes the place of $\sqrt{2}$, $N$ replaces 13, $N + 1$ replaces 14, and we find that there is some $n \le N$ for which we have

\[ (5) \qquad -\frac{1}{(N+1)n} < w - \frac{m}{n} < \frac{1}{(N+1)n}. \]


This proves the first part of theorem 2. 
Since we have $n \le N$, formula (5) implies

\[ (6) \qquad -\frac{1}{n^2} < w - \frac{m}{n} < \frac{1}{n^2}. \]

Therefore we have found fractions that differ from $w$ by less than $\frac{1}{n^2}$. In (6) the number $N$ does not appear, but it was used in finding the fraction $\frac{m}{n}$ with denominator $n \le N$.


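In the previous numerical example $n$ is 5 for $N = 5, 6, 7, \ldots, 11$ and we have

\[ 0 < 5\sqrt{2} - 7 < \frac{1}{N+1}, \qquad 0 < \sqrt{2} - \frac{7}{5} < \frac{1}{5(N+1)} < \frac{1}{5^2}. \]

When $N$ reaches 12, $n$ becomes 12 and we then have

\[ 0 < 17 - 12\sqrt{2} < \frac{1}{N+1}, \qquad 0 < \frac{17}{12} - \sqrt{2} < \frac{1}{12(N+1)} < \frac{1}{12^2}. \]

For $N = 13$, $n$ is again 12, etc.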
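The numerical example can be checked mechanically. The following Python sketch is an illustration only and is not part of the original argument: the name `best_denominator` is hypothetical, the arithmetic is floating-point, and $n$ is found by direct search over all $n \le N$ rather than by the interval argument of the proof. Since the proof guarantees that some $n \le N$ satisfies (5), the $n$ that brings $nw$ closest to a whole number must satisfy it as well.

```python
import math

def best_denominator(w, N):
    """Return (m, n), 1 <= n <= N, for which n*w lies closest to a whole number m."""
    best = None
    for n in range(1, N + 1):
        m = round(n * w)              # whole number nearest to n*w
        gap = abs(n * w - m)
        if best is None or gap < best[0]:
            best = (gap, m, n)
    return best[1], best[2]

w = math.sqrt(2)
for N in range(1, 14):
    m, n = best_denominator(w, N)
    # inequality (5): |w - m/n| < 1 / ((N + 1) * n)
    assert abs(w - m / n) < 1 / ((N + 1) * n)
    print(N, f"{m}/{n}")
```

Under these assumptions the search gives the fraction 7/5 for every $N$ from 5 through 11 and 17/12 for $N = 12$ and $N = 13$, in agreement with the text.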
The second part of theorem 2 asserts that for any given irrational $w$ there is an infinite number of fractions $\frac{m}{n}$ that have the property (6). To prove this we need only show that after each fraction $\frac{m}{n}$ satisfying (6), we can always find another one $\frac{m'}{n'}$ that lies still closer to $w$ and still satisfies (6). Since $w$ is irrational, it cannot be equal to any fraction. Therefore $w - \frac{m}{n}$ is not 0 and hence differs from 0 by some definite amount. Because of this there must be a fraction $\frac{1}{N'}$, with $N'$ sufficiently large, that is closer to 0 than $w - \frac{m}{n}$ is. That is, we have

\[ (7) \qquad 0 < \frac{1}{N'} < w - \frac{m}{n} \quad \text{or} \quad w - \frac{m}{n} < -\frac{1}{N'} < 0, \]


according to whether $w - \frac{m}{n}$ is positive or negative. Using $N'$ in place of $N$ in what we have already proved, we can find a fraction $\frac{m'}{n'}$, with $n' \le N'$, that satisfies the inequalities

\[ (8) \qquad -\frac{1}{(N'+1)n'} < w - \frac{m'}{n'} < \frac{1}{(N'+1)n'}, \]

corresponding to (5). From this we see that the requirement,

\[ -\frac{1}{n'^2} < w - \frac{m'}{n'} < \frac{1}{n'^2}, \]

is satisfied. Furthermore, since (8) is true, we certainly have

\[ -\frac{1}{N'+1} < w - \frac{m'}{n'} < \frac{1}{N'+1}, \]

that is, $w - \frac{m'}{n'}$ differs from 0 by less than $\frac{1}{N'+1}$. But according to (7), $w - \frac{m}{n}$ differs from 0 by more than $\frac{1}{N'}$. Therefore $\frac{m'}{n'}$ is closer to $w$ than is $\frac{m}{n}$, and consequently $\frac{m}{n}$ and $\frac{m'}{n'}$ are different.

3. Therefore there is no last fraction $\frac{m}{n}$ having the property (6). Each one is followed by yet another, so there are infinitely many of them, as theorem 2 asserts. While the trivial theorem 1 admits all whole numbers as denominators, this theorem picks out special denominators with the property (6). We can call these “good
denominators.” Our earlier numerical example shows that 2, 5, 12 are good denominators for the approximation of $\sqrt{2}$. Further computation reveals the further good denominators 29, 70, 169, $\ldots$. For $\pi$ the number 7 is a good denominator. In fact, it is a good deal better than is assured by theorem 2. For we have $\frac{22}{7} - \pi < \frac{1}{497}$, which is much less than $\frac{1}{7^2} = \frac{1}{49}$. One might wonder if there are still better denominators than we have found, denominators for which we will have, perhaps, $-\frac{1}{n^3} < w - \frac{m}{n} < \frac{1}{n^3}$ or some similar inequality.
We shall now show that such an improvement of theorem 2 is in general impossible for all irrational numbers $w$. We shall consider the particular value $w = \sqrt{2}$, and shall show that every fraction $\frac{m}{n}$ differs from $\sqrt{2}$ by more than $\frac{1}{3n^2}$.

We first consider the fractions that are greater than $\sqrt{2}$. For $\frac{m}{n} \ge 2$, since we have $\sqrt{2} < 1.45$, the difference $\frac{m}{n} - \sqrt{2}$ is greater than 0.55, while $\frac{1}{3n^2} \le \frac{1}{3}$ is less than 0.55. For $1.55 \le \frac{m}{n} < 2$ we have $\frac{m}{n} - \sqrt{2} > 0.10$, while $\frac{1}{3n^2}$ is at most $\frac{1}{3 \cdot 4} = \frac{1}{12} < 0.10$, since $n$ is now at least 2. Now if $\frac{m}{n}$ is between $\sqrt{2}$ and 1.55, we have

\[ \frac{m^2}{n^2} > 2, \qquad m^2 > 2n^2, \qquad m^2 = 2n^2 + g, \]

where $g$ is some positive whole number. Consequently $g$ is at
least 1, and


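\[ \left(\frac{m}{n} + \sqrt{2}\right)\left(\frac{m}{n} - \sqrt{2}\right) = \frac{m^2}{n^2} - 2 = \frac{g}{n^2} \ge \frac{1}{n^2}, \]

and then

\[ \frac{m}{n} - \sqrt{2} \ge \frac{1}{n^2} \cdot \frac{1}{\frac{m}{n} + \sqrt{2}}. \]

Since $\frac{m}{n}$ is less than 1.55, we find $\frac{m}{n} + \sqrt{2} < 1.55 + 1.45 = 3$, so its reciprocal is greater than $\frac{1}{3}$ and

\[ \frac{m}{n} - \sqrt{2} > \frac{1}{3n^2}, \]

as was asserted.

Finally, for $0 < \frac{m}{n} < \sqrt{2}$ we can similarly argue:

\[ 2 - \frac{m^2}{n^2} \ge \frac{1}{n^2}, \]

\[ \sqrt{2} - \frac{m}{n} \ge \frac{1}{n^2} \cdot \frac{1}{\frac{m}{n} + \sqrt{2}} > \frac{1}{n^2} \cdot \frac{1}{2\sqrt{2}} > \frac{1}{3n^2}. \]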
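The bound just proved can also be tested numerically. The sketch below is an illustration under stated assumptions (floating-point arithmetic, an arbitrary search limit of 2000, and the hypothetical helper name `scaled_error`). For each denominator $n$ it takes the numerator $m$ that makes $\frac{m}{n}$ closest to $\sqrt{2}$ and checks that even this best fraction misses $\sqrt{2}$ by more than $\frac{1}{3n^2}$.

```python
import math

SQRT2 = math.sqrt(2)

def scaled_error(n):
    """n^2 * |sqrt(2) - m/n| for the whole number m that makes m/n closest to sqrt(2)."""
    m = round(n * SQRT2)
    return n * n * abs(SQRT2 - m / n)

# If the smallest scaled error exceeds 1/3, every fraction m/n with n <= 2000
# differs from sqrt(2) by more than 1 / (3 n^2).
worst = min(scaled_error(n) for n in range(1, 2001))
assert worst > 1 / 3
```

A finite search of this kind proves nothing for larger denominators; it only illustrates the inequality that the argument above establishes for all $n$.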


18. Producing Rectilinear Motion 
by Means of Linkages 


James Watt’s original steam engine was equipped with a remark- 
able mechanism called Watt’s parallelogram because of its shape. 
The apparatus, which is shown schematically in Fig. 58, consists 
of five rods hinged together at C, D, E, F. At A and B the rods are 
held in place by means of pivots that allow the rods to rotate.¹
All the hinges are made in such a way that the rods can never move 
out of the plane. The piston rod is attached at F. The purpose 
of this apparatus was to force the end of the piston rod to move along 
a straight line. This is necessary in order to keep the piston from 
becoming jammed in the cylinder. However, it can be shown 
mathematically that the point F does not actually follow exactly a 
straight line. Rather, it moves along a curve which is so close to 
being a straight line that the mechanism serves the purpose for which 
it was intended. 

The parallelogram CDEF, which gives its name to this linkage, is not really an essential part of it. Its purpose is merely to magnify the usable part of the motion. The essential part is the linkage ADCB. For motions that are not too great, this part causes the midpoint M of the rod DC to approximately follow a straight line. If the linkage is constructed with AD = DE = CF = BC and DC = EF (the rod AE does not bend at D), the points A, M, F will always lie on a line because of the similarity of the triangles AEF and ADM. Since, because of the similarity of the triangles, AF is twice AM, the point F has a motion that is “similar” to the motion of M, but is just twice as great. The motion of M can just as well be magnified k-fold. We keep AD = BC so that the midpoint M of CD will approximately follow a straight line, but we increase the size of the parallelogram DEFG (Fig. 59) to obtain

\[ AD : AE = DM : EF = 1 : k. \]

Fig. 58   Fig. 59

1 In the schematic figures, hinges whose positions are fixed will be represented by black circles, movable hinges by hollow circles. Dotted lines are auxiliary lines and do not represent rods.


In order to investigate the motion that Watt’s parallelogram 
actually produces, we could now restrict ourselves to a discussion 
of the motion of M. It is easy to show that the motion is not 
rectilinear by considering several extreme positions of the linkage. 
However, since this particular motion is not of importance to our 
other topics, we shall not study it further. 

2. The production of straight-line motion by means of linkages 
has been of considerable importance in the history of machine 
design, and the problem has been taken up by a large number of 
workers. But the particular use to which Watt put his linkage is 
not important any longer. On a modern steam engine the end of the
piston rod is held in a straight line by an entirely different sort of
mechanism, a “crosshead” that slides between parallel rails. It
may be that the early designers were led to use linkages rather than 
the crosshead principle because they had an incorrect idea of the 
size of the frictional forces involved. 

The linkage problem has not only occupied practical designers, but it has also attracted the attention of pure mathematicians. The mathematicians have naturally considered the problem in its strictest form, to find a linkage in which one pivot will follow a theoretically exact straight line. The first to take up this problem was the great Russian mathematician P. L. Tschebyscheff (1821-1894), who studied Watt’s parallelogram and its possible improvements without finding a linkage that would produce accurately a straight line. Many other fruitless attempts were made during the first half of the last century, and finally mathematicians began to doubt whether an exact solution was possible.

Then, in 1864, Peaucellier devised a linkage that produces 
straight-line motion. This apparatus is called Peaucellier’s cell. 
Then, as so often happens in the history of discovery, a great many 
solutions of the problem were found. Linkages were found that 
would produce various curves of which the straight line is only a 
particular case. 

3. We shall now consider Peaucellier’s cell. This linkage 
produces not only straight-line motion but also an “inversion” of 
the plane. An inversion is a particular kind of “mapping” of the 
plane onto itself. By a mapping we mean a picturing of the plane
in which each point has an image. Some simple mappings are: 
(1) reflection in a line, which we used repeatedly in Chapters 5 
and 6; (2) parallel displacement, in which the image of each point 
is found by displacing the point itself by a fixed amount in a fixed 
direction; (3) the rotation of the plane about a fixed point through 
a fixed angle; (4) the stretching of the plane about a point (the 
center), in which each point is moved out along the ray from the 
center so that its distance from the center is increased by a fixed 
ratio 1 : k. The image of a point need not always be different
from the original point. In parallel displacement, every point is 
mapped on an image that is different from it, but in rotation or 
stretching the center has itself as its image. In the case of reflection 
in a line, each point of the line is mapped onto itself. 

These examples should serve to explain what we mean by a 
mapping. Now we must see just what type of mapping an inversion 
is. A circle C with center O, the center of the inversion, is given 
in Fig. 60. To find the image P’ of any point P, we draw a ray 
from O through P. This ray will cut the circle at some point Q. 
We then determine P’ on the ray to satisfy $OP : OQ = OQ : OP'$. If we call the radius of the circle $a$ and write $OP = r$, $OP' = r'$, then the proportion can be replaced by the equation

\[ r r' = a^2. \]

Fig. 60

From the definition of an inversion, it is clear that the image of every point inside the circle of inversion C is outside the circle and vice versa. Each point of the circle of inversion is its own image. Because of this, an inversion is often called “reflection in a circle”. Just as with ordinary reflection, inversion has the property that the image of the image is the original point itself.
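The definition can be put into coordinates. The following Python sketch is only an illustration: the function name `invert` and the use of coordinate pairs are assumptions made here, not notation from the text. It moves a point along its ray from the center so that the product of the two distances equals the square of the radius.

```python
def invert(point, a, center=(0.0, 0.0)):
    """Image of `point` under inversion in the circle of radius a about `center`."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    r_squared = dx * dx + dy * dy
    if r_squared == 0:
        raise ValueError("the center of inversion has no image")
    k = a * a / r_squared              # stretches the ray so that OP * OP' = a^2
    return (center[0] + k * dx, center[1] + k * dy)

inside = (0.5, 0.0)
outside = invert(inside, 2.0)          # distance 0.5 becomes 4 / 0.5 = 8
assert outside == (8.0, 0.0)
assert invert(outside, 2.0) == inside  # the image of the image is the original point
assert invert((0.0, 2.0), 2.0) == (0.0, 2.0)   # points of the circle stay fixed
```

The three assertions correspond to the three properties noted above: inside goes to outside, the mapping undoes itself, and the circle of inversion is left pointwise fixed.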

4. So far as single points are concerned, all the properties of an
inversion are immediately evident from the definition. New 
questions arise if we consider the images of curves, but we cannot 
undertake to make a complete investigation of this question. The 
most important result is that every line or circle is mapped by an 
inversion into either a line or a circle. We shall require only a part 
of this statement, namely, that the image of a circle that passes through 
the center of inversion is a line. We shall prove this assertion. 

Let k be the circle passing through the center O, and let its diameter 
be OA = d (Fig. 61). The circle of inversion K has radius a. The 
image A’ of A lies on the extension of the diameter OA, and the 
distance $OA' = d'$ must satisfy the equation $dd' = a^2$. Now we shall prove that the image of k is the line perpendicular to OA’ through A’. To do this we must show that every line through O meeting the circle k at a point P meets the line at P’, with $OP \cdot OP' = a^2$. If we draw the line AP, we have two right triangles OAP and OP’A’ which are similar because they have a common angle at O. Therefore we have the proportion

\[ OP : OA = OA' : OP', \]

or, writing $OP = r$, $OP' = r'$,

\[ r r' = d d'. \]

But using $dd' = a^2$, we have

\[ r r' = a^2, \]

and this is just the relation between a point and its image required for an inversion.

Fig. 61
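This result lends itself to a quick numerical check. In the sketch below, which is an illustration and not part of the original proof, the coordinates are an assumption: O is placed at the origin and the diameter OA along the x-axis, so that the claimed image line is $x = d' = a^2/d$. The helper name `image_abscissa` and the sample values of $a$ and $d$ are hypothetical.

```python
import math

def image_abscissa(a, d, t):
    """x-coordinate of the image of the point at angle t on the circle k with diameter OA = d."""
    px = d / 2 + (d / 2) * math.cos(t)     # k has center (d/2, 0) and radius d/2
    py = (d / 2) * math.sin(t)
    scale = a * a / (px * px + py * py)    # inversion in the circle of radius a about O = (0, 0)
    return scale * px

a, d = 2.0, 1.0                            # sample values, chosen only for illustration
for t in (0.3, 1.0, 2.0, 2.8):             # t = pi would give O itself, which has no image
    assert math.isclose(image_abscissa(a, d, t), a * a / d)
```

Every sampled point of k lands on the same vertical line, at distance $a^2/d$ from O, as the proof requires.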

Besides this fact, we shall also need a result that is proved in elementary
geometry. If two secants are drawn through a fixed point
outside a circle, the product of one and its external segment equals 
the product of the other and its external segment. In the notation 
of Fig. 62, this theorem can be written as $s_1 s_1' = s_2 s_2'$. Its proof
follows directly from the similarity of the triangles $AP_1P_2'$ and $AP_2P_1'$,
which have the common angle A and in which the angles at $P_1'$
and $P_2'$ are equal because they intercept the same arc $P_1P_2$.

5. With these preliminaries out of the way, we may now return 


to Peaucellier’s cell (Fig. 63). It consists of four rods of length $c$, hinged to form a rhombus PQP’R, and two rods of length $b > c$, hinged to two opposite vertices Q and R of the rhombus. The other ends of these two rods pivot about the fixed point O.

Fig. 62   Fig. 63

Because of the symmetry of the system of rods, the points O, P, P’ are always on a line, the axis of symmetry of the figure. We draw a circle about Q with radius $c$. This circle passes through P and P’ and crosses the rod OQ and its extension in two points S and T. Then OPP’ and OST are two secants of the circle, to which we can apply the elementary theorem mentioned above. It gives us the result

\[ OP \cdot OP' = OS \cdot OT = (b - c)(b + c) = b^2 - c^2. \]

Therefore the product $OP \cdot OP'$ is a constant. If we set $a^2 = b^2 - c^2$, then $a$ is one leg of a right triangle with hypotenuse $b$ and other leg $c$. The equation $OP \cdot OP' = a^2$ then shows that P’ is the image of P under an inversion in the circle of radius $a$ about O.
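The constancy of the product can be confirmed for a few positions of the cell. The sketch below is illustrative only; it assumes a description of the position by the angle `phi` between the rod OQ and the axis OPP’, a parameter that does not occur in the text, and the function name `peaucellier_distances` is hypothetical.

```python
import math

def peaucellier_distances(b, c, phi):
    """Return (OP, OP') for long rods b, rhombus side c, and angle phi between OQ and the axis."""
    h = b * math.sin(phi)              # distance of Q from the axis through O, P, P'
    if h > c:
        raise ValueError("the rhombus cannot close: b*sin(phi) exceeds c")
    foot = b * math.cos(phi)           # foot of the perpendicular from Q to the axis
    half = math.sqrt(c * c - h * h)    # P and P' lie this far on either side of the foot
    return foot - half, foot + half

for phi in (0.0, 0.2, 0.5):
    op, op_prime = peaucellier_distances(5.0, 3.0, phi)
    assert math.isclose(op * op_prime, 5.0 ** 2 - 3.0 ** 2)   # a^2 = b^2 - c^2 = 16
```

With $b = 5$ and $c = 3$ the radius of inversion is $a = 4$, the remaining leg of the 3-4-5 right triangle.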

Peaucellier’s cell therefore sets up an inversion in the part of the 
plane that can be reached by P and P’. To obtain rectilinear 
motion, we need only cause P to follow a circle passing through O. 
Then, from what we know of inversions, the image P’ will move 
along a straight line. There is no difficulty in forcing P to follow 
a circle. We merely hinge a rod at P and pivot its end at a fixed
point Z. In order to make the circle pass through O we must make
the length of the rod ZP equal to the fixed distance OZ.

With the added rod, Peaucellier’s cell consists of 7 rods. The
point P is restrained to an arc of a circle which, perhaps only when
extended, passes through O, and P’ then moves along a straight
line. This line will be perpendicular to OZ.

6. Other linkages which produce inversions, and which can 
then be used to produce rectilinear motion (just as with Peaucellier’s 
cell), have been found. One of these, devised by Hart, requires 
only 4 rods instead of the 6 of Peaucellier’s cell. The 4 rods are 
hinged together as they would be to form a parallelogram ABCD, 
but the parallelogram is “turned inside out”. That is, the points A and C are pulled apart until the parallelogram collapses into a straight line, and then D and B are brought to the same side of AC (Fig. 64). The two “diagonals” AC and DB are always parallel because the triangles ACB and ACD are congruent and therefore have equal altitudes. Now the linkage is held in any one position and the points O, P, P’, R are marked on the 4 rods on a line parallel to these diagonals. We shall now show that these points always lie on a line parallel to the diagonals, regardless of the position of the linkage.

Fig. 64

If we first look at the triangle DAB in Fig. 64, we notice that OP 
is parallel to DB in the original position of the linkage. For this 
reason, we have 


(1) AO : OD = AP: PB. 


We have proved (1) only for the original position of the linkage, 
but since it is a proportion involving only the lengths of segments, 
it must remain true for every position of the linkage. From (1) 
we now have OP || DB for every position of the linkage. Now, 
looking at the triangle ADC, we see that OP’ is parallel to AC in 
the original position of the linkage, so we have 


\[ (2) \qquad DO : OA = DP' : P'C. \]

This proportion between lengths of segments remains true for all
positions of the linkage, so we have AC || OP’. Since we have
proved OP || DB, DB || AC, AC || OP’, we have OP || OP’, and
OP and OP’ are the same line since they are parallel and have the
point O in common. This line is parallel to DB. Similar reasoning
applied to the triangle DCB shows that P’R || DB and we find that 
R also lies on the line containing O, P, P’. 
Furthermore, because the three lines are parallel, we have

\[ OP : DB = AO : AD, \]

\[ OP' : AC = DO : DA. \]

These proportions can be replaced by the equations

\[ (3) \qquad OP \cdot AD = AO \cdot DB, \]

\[ (4) \qquad OP' \cdot DA = DO \cdot AC. \]

If we multiply corresponding sides of these equations together and divide by $AD^2$, we find

\[ (5) \qquad OP \cdot OP' = \frac{AO \cdot DO}{AD^2} \, AC \cdot DB. \]
On the right side the numbers AO, DO, AD are fixed lengths that 
are determined by the apparatus, while AC and DB are the lengths 
of the diagonals and apparently depend on the position of the 
linkage. However, the product $AC \cdot DB$ is a constant. For, drawing BE || DA and BF ⊥ AC, with E and F on AC, we have AE = DB (since ADBE is a parallelogram) and EF = FC (since BE = DA = BC), and hence

\[ (6) \qquad AC \cdot BD = (AF + FC)(AF - FC) = AF^2 - FC^2. \]

From the right triangles AFB and FCB and the Pythagorean theorem we find

\[ AF^2 + FB^2 = AB^2, \]

\[ FC^2 + FB^2 = CB^2, \]

and then, by subtraction,

\[ AF^2 - FC^2 = AB^2 - CB^2. \]

This, with (6), gives

\[ (7) \qquad AC \cdot BD = AB^2 - CB^2, \]

and the right side is a constant that does not depend on the position of the linkage. Now from (7) and (5) we have

\[ (8) \qquad OP \cdot OP' = \frac{AO \cdot DO}{AD^2} \left(AB^2 - CB^2\right), \]
where the right side is a constant that can be determined by making 
certain measurements on the linkage. Since O, P, P’ lie on a line, 
the point P’ will be the image of P under an inversion if we keep 
the position of O fixed. The circle of inversion has its center at 
O, and the square of its radius is given by the right side of (8). 

The proof that Hart’s linkage produces an inversion is now complete. In order to produce rectilinear motion, we need only add a fifth rod that will restrain P to a circle through O.
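Formula (8) can be checked in coordinates. The sketch below is an illustration resting on several assumptions that are not in the text: the crossed parallelogram is placed with AC on the x-axis and DB parallel to it, its position is described by the half-length `p` of AC, and the three marked points are fixed by a single ratio `t` = AO : AD. The function name `hart_marked_points` is hypothetical. Because the coordinate frame here moves with the linkage, only distances, not positions, should be compared between configurations.

```python
import math

def hart_marked_points(long_rod, short_rod, t, p):
    """Positions of O, P, P' on a crossed parallelogram ABCD.

    AB = CD = long_rod and AD = BC = short_rod. O, P, P' divide AD, AB, CD so that
    AO : AD = AP : AB = CP' : CD = t. The position is chosen by p = AC / 2.
    """
    q = (long_rod ** 2 - short_rod ** 2) / (4 * p)   # half of DB, fixed by the rod lengths
    h_squared = short_rod ** 2 - (p - q) ** 2        # squared distance between AC and DB
    if h_squared < 0:
        raise ValueError("the rods cannot reach this position")
    h = math.sqrt(h_squared)
    A, C, D, B = (-p, 0.0), (p, 0.0), (-q, h), (q, h)

    def along(U, V, s):
        # point dividing the segment UV in the ratio s : (1 - s), measured from U
        return (U[0] + s * (V[0] - U[0]), U[1] + s * (V[1] - U[1]))

    return along(A, D, t), along(A, B, t), along(C, D, t)

for p in (1.5, 2.0, 3.0):                            # three positions of the same linkage
    O, P, P1 = hart_marked_points(5.0, 3.0, 0.4, p)
    assert math.isclose(O[1], P[1]) and math.isclose(O[1], P1[1])    # on a line parallel to AC
    product = math.dist(O, P) * math.dist(O, P1)
    assert math.isclose(product, 0.4 * 0.6 * (5.0 ** 2 - 3.0 ** 2))  # right side of (8)
```

In this parametrization $AO \cdot DO / AD^2 = t(1 - t)$, so with $t = 0.4$ and rods of lengths 5 and 3 the constant in (8) is $0.24 \cdot 16 = 3.84$.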

7. There are also linkages that produce rectilinear motion 
without depending on an inversion. The double rhomboid of 
Kempe (Fig. 65) is a particularly ingenious arrangement. By 
the term “rhomboid” we mean a four-sided figure in which two 
pairs of adjacent sides are equal. In a rhomboid the two angles 
formed by two unequal sides are always equal. Let ABCD and 
BCEF be two hinged rhomboids in which the ratio of the smaller side to the larger is the same in both, and in which the larger side of the smaller rhomboid is equal to the smaller side of the larger rhomboid. Then in Fig. 65 we have AD = AB, CD = CB = CE, FB = FE and CB : AD = FB : CB. Since the vertex F of the smaller rhomboid lies on one side of the larger, the two rhomboids have the angle at B in common. Therefore the angles at B, at E, and at D are all equal. Now two convex quadrilaterals are similar if their sides are in the same ratio and they have one pair of corresponding angles equal. No matter how the linkage is moved, the two rhomboids ABCD and BCEF will always remain similar.

Fig. 65

If two opposite sides of a rhomboid are extended until they intersect, they form an angle. These angles, EXF and CYB, are evidently equal for similar rhomboids. If we draw CK || AB through C, we have ∠DCK = ∠CYB, since they are corresponding angles of parallel lines. We also have ∠KCE = ∠EXF, since they are alternate interior angles. Therefore all four of these angles, which are marked in the figure, are equal. Consequently CK is the bisector of the vertex angle of the triangle CDE, which is isosceles by hypothesis, and it is perpendicular to the base DE. Since CK was drawn parallel to AB, the line DE will then always be perpendicular to AB.

Now if we keep D fixed and move AB in such a way that it always 
remains parallel to its original position, then the perpendicular from 
D to AB will remain fixed. But the point E is on this perpendicular,
so it can have only rectilinear motion along this perpendicular. In
order to keep AB always parallel to itself, we add a rod BG of length
BG = AB = AD. We keep the end G of the rod fixed at the point
determined by the condition DG = AB. Then ABGD is a rhombus, 
so its opposite sides AB and DG remain parallel for all positions of 
the linkage. That is, AB always remains parallel to the fixed line 
DG and hence is always parallel to itself. Thus rectilinear motion 
is finally attained. 

8. The double rhomboid of Kempe can be arranged to give 
rectilinear motion in another especially elegant way. We supply 
ourselves with two copies ABCDEF and A’B'C'D’E’F’ of the double 
rhomboid linkage, one of which is the mirror image of the other. 
We connect these two so that they have the point C in common, and so that DCE’ and D’CE become rigid rods that do not bend at C (Fig. 66). The bisector CK of angle DCE is, as we have seen, parallel to AB, and similarly the bisector CK’ of angle D’CE’ is parallel to A’B’. Since the angles DCE and D’CE’ are vertical angles by construction, the two bisectors CK and CK’ lie in a line. The lines AB and A’B’ are parallel to this line, so they are parallel to each other for every position of the linkage. More than this, the line A’B’ always lies on the extension of the line AB. Since we know that the two lines are parallel, all we need do is prove that the point C is equidistant from each of these lines AB and A’B’.

Fig. 66
We shall do this by showing that the two rhomboids ABCD and 
A’B’CD’ are always congruent for all positions of the linkage. In 
the first place, since the angles DCE and D’CE’ are vertical angles, 
they are equal. By § 7 these angles are twice the angle between 
two opposite sides in each of the rhomboids. Corresponding sides 
of the two rhomboids are equal. If we can show that a rhomboid is completely determined by the lengths of its sides and the angle between a pair of opposite sides, then our two rhomboids will be congruent, since they are determined by equal sides and an equal angle. To see that the rhomboid is completely determined we construct the rhombus BCDH inside the rhomboid ABCD (Fig. 67). The points A, H, C lie on a line because of symmetry.² Since HB is parallel to DC, the angle ABH is equal to the angle between two opposite sides. It is therefore determined as half the given angle DCE of Fig. 66. Since AB is given and BH is equal to the given BC, the triangle ABH is completely determined. Then D is determined as the mirror image of B in AH, and C is determined as the fourth vertex of the rhombus BCDH, three of whose vertices, B, H, D, have already been determined.

Fig. 67

Therefore, as was asserted, the two halves of Fig. 66 are always congruent to one another. Consequently the height of C above AB equals its height above A’B’, and then AB and A’B’ lie on the same line.

2 This figure is exactly that of Peaucellier’s cell.

Now if we keep A and B fixed, the rod A’B’ can move but must 
remain on the line AB. Thus we have obtained rectilinear motion 
from this new linkage. 

This linkage does considerably more than the previous ones. In 
the previous mechanisms there is a single point that moves along a 
line. In this case a whole rod A’B’ moves along the line in which 
it lies. Since any figure can be rigidly attached to A’B’, this apparatus
accomplishes a parallel displacement of any figure or even
of the whole plane by which all points move along equal and parallel 
line segments. 


19. Perfect Numbers 


Book IX of Euclid’s Elements is the third and last book that is 
occupied with arithmetic. This book includes the proof of the 
infinitude of prime numbers, which we have reproduced in Chapter 1, 
and it concludes with a discussion of so-called perfect numbers. 
Perfect numbers are also mentioned by Plato, especially in an 
enigmatic passage in his Republic where, in an obscure discussion of 
eugenics, he introduces the “nuptial number”.

The subject of perfect numbers and the theorems that were later 
proved concerning them are now no more than an interesting 
curiosity in the body of modern mathematics. But we shall discuss 
this minor topic because in the method Euclid used there burns 
the tiny spark of an idea which, as we shall see in the next chapter, 
was rekindled by Euler and has flared up into a great flame, the 
modern theory of the distribution of prime numbers. 

Euclid defined a perfect number as one that is equal to the sum 
of all its divisors. For example, the number 6 is a perfect number, 
since its divisors are easily found by trial to be 1, 2, 3, and 1+2+3=6. 
In this definition of a perfect number, the number itself is obviously 
not counted as one of its divisors. By continued trials, the next 
perfect number after 6 is found to be 28 = 1 + 2 + 4 + 7 + 14.
The next perfect number is too large to be easily found by trial, 
but this is unimportant. What we really want is a general systematic 
method of finding perfect numbers. 




1. It is clear that a prime number cannot be a perfect number. 
1 and p are all the divisors of the prime p, and p itself is not to be 
counted. Therefore the sum of all the divisors is merely 1, which is 
certainly not equal to p.

2. Having taken care of this very simple case, we can consider 
one only a little more complicated. The number 9 is not a prime, 
but it is the square of a prime. By trying the numbers from 1 
through 8, we see that 1 and 3 are all the divisors that are to be 
counted. Since 1+ 3 = 4 is less than 9, the number 9 is not 
perfect. 

Correspondingly, the numbers 1 and $p$ are obviously divisors of $p^2$, the square of a prime. Now we cannot verify by trial, as in the case of 9, that these are all the divisors that are to be counted. We can, however, prove that these two divisors exhaust the possibilities. This was done by Euclid after he had established the necessary lemmas. Our proof is basically the same as Euclid’s except for the way in which it is formulated. It is based on the theorem of the unique factorization into prime factors, which we proved in Chapter 11. If $p^2$ had any other divisors, it could be factored in some way different from $p^2 = p \cdot p$. In fact the other factorization would involve some prime different from $p$, and this is impossible because of the unique factorization theorem.

3. More generally, we can show in the same way that no power of a prime can be a perfect number. Just as for $p^2$, we see that all the divisors of $p^a$ that are to be counted are

\[ (1) \qquad 1,\ p,\ p^2,\ \ldots,\ p^{a-1}. \]

Since they form a geometric series, their sum is given by a familiar formula, one that Euclid proved expressly for the present purpose. In fact,

\[ (1a) \qquad 1 + p + p^2 + \cdots + p^{a-1} = \frac{p^a - 1}{p - 1}. \]

Because the denominator $p - 1$ is 1 for the smallest prime $p = 2$, and is larger for all larger primes, the fraction in (1a) is never any larger than its numerator $p^a - 1$. Therefore the sum of the divisors of $p^a$ is never any larger than $p^a - 1$. It is less than $p^a$, and hence $p^a$ is not a perfect number.

4. Now we can see how to discuss a number like $72 = 2^3 \cdot 3^2$, which contains two primes. We shall immediately consider the number $p^3q^2$. The divisors of $p^3q^2$, including the number $p^3q^2$ itself, are

\[ (2) \qquad \begin{array}{llll} 1, & p, & p^2, & p^3, \\ q, & qp, & qp^2, & qp^3, \\ q^2, & q^2p, & q^2p^2, & q^2p^3, \end{array} \]

that is, all numbers of the form $p^\alpha q^\beta$ where $\alpha$ goes up to 3, $\beta$ to 2. It is obvious that all of these numbers divide $p^3q^2$. It again follows directly from the theorem on unique factorization that these include all the divisors. The table (2) gives us a complete view of all the divisors of our number. In order to find the sum of the divisors, we note that the second row is merely the first row multiplied by $q$, and that the third row is the first multiplied by $q^2$. The sum of the first row is the sum of the geometric series $1 + p + p^2 + p^3$, so the complete sum is this sum taken 1 time plus $q$ times plus $q^2$ times. Therefore the complete sum is

\[ D = (1 + q + q^2)(1 + p + p^2 + p^3). \]

Analogously, the sum of the divisors of $N = p^a q^b$ is

\[ (2a) \qquad D = (1 + p + \cdots + p^a)(1 + q + \cdots + q^b) = \frac{p^{a+1} - 1}{p - 1} \cdot \frac{q^{b+1} - 1}{q - 1}, \]

but in these formulas we must remember that we have included the number $N$ itself among the divisors.

5. This last result can be generalized to numbers $N$ containing more than two primes. For $N = p^a q^b r^c$, besides a table like (2), we will also have the whole table multiplied by $r$, and by $r^2$, and so on until it is finally multiplied by $r^c$. In all, the table is multiplied by $(1 + r + r^2 + \cdots + r^c)$. If $N$ has still another prime factor, then the sum of the divisors has another corresponding factor. In general, for $N = p^a q^b r^c \cdots$, the sum of the divisors of $N$ is

\[ D = (1 + p + p^2 + \cdots + p^a)(1 + q + q^2 + \cdots + q^b)(1 + r + r^2 + \cdots + r^c) \cdots = \frac{p^{a+1} - 1}{p - 1} \cdot \frac{q^{b+1} - 1}{q - 1} \cdot \frac{r^{c+1} - 1}{r - 1} \cdots, \]

where, again, we are including the number $N$ itself.
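The product formula is easy to compare with a direct count. The Python sketch below is an illustration only; the function names are hypothetical, and the factorization is done by simple trial division, which is adequate for small numbers such as 72.

```python
def prime_factorization(n):
    """Return {prime: exponent} for a whole number n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                                   # whatever is left is a single prime
        factors[n] = factors.get(n, 0) + 1
    return factors

def divisor_sum(n):
    """Sum of all divisors of n, n itself included, as a product of geometric series."""
    total = 1
    for p, a in prime_factorization(n).items():
        total *= (p ** (a + 1) - 1) // (p - 1)  # 1 + p + p^2 + ... + p^a
    return total

def divisor_sum_by_trial(n):
    """The same sum found by testing every whole number up to n."""
    return sum(k for k in range(1, n + 1) if n % k == 0)

# 72 = 2^3 * 3^2, so D = (1 + 2 + 4 + 8)(1 + 3 + 9) = 15 * 13 = 195
assert divisor_sum(72) == divisor_sum_by_trial(72) == 195
```

Subtracting the number itself gives the sum that matters for perfect numbers; for 72 this is 195 - 72 = 123, so 72 is not perfect.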

6. Basically all of this is included in Euclid, but it is given expressly only for the case $N = p \cdot 2^a$. In this case, where $N$ is a product of two factors, one a power of 2, the other a prime $p$ to the first power, formula (2a) becomes

\[ D = \frac{2^{a+1} - 1}{2 - 1} \cdot \frac{p^2 - 1}{p - 1} = (2^{a+1} - 1)(p + 1). \]

If $N$ is a perfect number, we have $D = 2N$, since $N$ is included among the divisors, and

\[ D = (2^{a+1} - 1)(p + 1) = 2N. \]

Putting in the value of $N$, we find

\[ (2^{a+1} - 1)(p + 1) = 2 \cdot p \cdot 2^a = 2^{a+1} p, \]

\[ 2^{a+1}(p + 1) - (p + 1) = 2^{a+1} p, \]

\[ 2^{a+1} = p + 1, \]

\[ (4) \qquad p = 2^{a+1} - 1. \]

Therefore if $p$ is not only a prime but is also equal to $2^{a+1} - 1$, then $N$ is a perfect number. The number $2^{a+1} - 1$ is not always a prime for every value of $a$, but we have Euclid’s theorem:

The number $N = (2^{n+1} - 1) \cdot 2^n$ is a perfect number for all numbers $n$ for which $2^{n+1} - 1$ is a prime.
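7. Let us try the values $n = 1, 2, 3, \ldots$ in order to find some
further perfect numbers:

$$
\begin{array}{ll}
n = 1: & N = (2^2 - 1) \cdot 2 = 3 \cdot 2 = 6, \\
n = 2: & N = (2^3 - 1) \cdot 2^2 = 7 \cdot 4 = 28, \\
n = 3: & N = (2^4 - 1) \cdot 2^3 = 15 \cdot 8 \quad \text{not perfect, 15 not prime,} \\
n = 4: & N = (2^5 - 1) \cdot 2^4 = 31 \cdot 16 = 496, \\
n = 5: & N = (2^6 - 1) \cdot 2^5 = 63 \cdot 32 \quad \text{not perfect, 63 not prime,} \\
n = 6: & N = (2^7 - 1) \cdot 2^6 = 127 \cdot 64 = 8128.
\end{array}
$$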

Euclid’s theorem immediately gives rise to a new problem: For what
values of $n$ is $2^{n+1} - 1$ a prime? One step towards its solution can be
made at once. If $n + 1$ is a composite number, say $n + 1 = uv$,
we have $2^{n+1} - 1 = 2^{uv} - 1 = (2^u)^v - 1$. Now, by the formula
for the sum of a geometric series, we find

$$x^v - 1 = [x^{v-1} + x^{v-2} + \cdots + x + 1][x - 1],$$

and letting $x = 2^u$ we obtain

$$(2^u)^v - 1 = [(2^u)^{v-1} + \cdots + (2^u) + 1][(2^u) - 1],$$

a product of two factors. Therefore $2^{n+1} - 1$ cannot be a prime
unless $n + 1$ is a prime itself. In the continuation of the above
table, the next prime after 7 is 11, so the next value that must be
tried is $n = 10$. Since $2^{11} - 1 = 2047 = 23 \cdot 89$ is not a prime,
the number $(2^{11} - 1) \cdot 2^{10}$ is not a perfect number, but the
factorization of 2047 has required a very considerable number of
trials. Several further numbers have been found to be perfect, but
no complete rule for finding them all has ever been discovered.
Another question that has remained unanswered is whether there is
an infinite number of these perfect numbers or whether there is
finally a last one.
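The search described in this section is easy to repeat mechanically. The sketch below is an illustration under our own naming (`smallest_factor`, `is_prime`, `euclid_perfect_numbers` are hypothetical helpers, not anything from the text), and it tests primality by plain trial division, which is adequate only for the small values considered here.

```python
def smallest_factor(k):
    """Smallest divisor of k that is greater than 1 (k itself if k is prime)."""
    d = 2
    while d * d <= k:
        if k % d == 0:
            return d
        d += 1
    return k


def is_prime(k):
    return k >= 2 and smallest_factor(k) == k


def euclid_perfect_numbers(max_n):
    """List the pairs (n, N) with N = (2^(n+1) - 1) * 2^n and 2^(n+1) - 1 prime."""
    found = []
    for n in range(1, max_n + 1):
        m = 2 ** (n + 1) - 1
        if is_prime(m):
            found.append((n, m * 2 ** n))
    return found


print(euclid_perfect_numbers(10))  # [(1, 6), (2, 28), (4, 496), (6, 8128)]

# n = 10 fails because 2^11 - 1 = 2047 has the factor 23.
print(smallest_factor(2 ** 11 - 1))  # 23
```

Every n that appears in the output has n + 1 prime, in agreement with the remark on composite exponents, while n = 10 shows that a prime value of n + 1 is not sufficient.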

8. Euclid’s result only goes as far as this theorem, but there is 
the most varied evidence that more was known at that time. For 
one thing, Jamblichos states without further explanation that there 
are no even perfect numbers other than those given by Euclid. 
It is not known whether this was proved by the ancients or, if so, 
how they did it. However, a proof based on the formula (3) 
has been given by Euler, and this formula is essentially contained 
in Euclid’s work. Even though it is not important to the following 
topics, we shall reproduce this interesting proof. 

Let $N$ be any even number. Then 2 is one of the prime factors
of $N$ and it appears to some power, say $2^n$. The remaining prime
factors of $N$ are odd, and together they represent some odd number $u$.
Then we have $N = 2^n u$. If we form the sum of the divisors $D$ of
$N$ according to (3), the first factor is

$$\frac{2^{n+1} - 1}{2 - 1} = 2^{n+1} - 1,$$

and the second factor is exactly what would arise if (3) were used
to find the sum $U$ of the divisors of $u$. We now have

$$D = (2^{n+1} - 1)U, \tag{5}$$

which must equal $2N$ if $N$ is perfect, since we have included $N$ among
the divisors. Therefore if $N$ is perfect we find

$$
\begin{aligned}
(2^{n+1} - 1)U &= 2N = 2 \cdot 2^n u = 2^{n+1} u, \\
2^{n+1} U - U &= 2^{n+1} u, \\
2^{n+1}(U - u) &= U = (U - u) + u,
\end{aligned}
$$

$$(2^{n+1} - 1)(U - u) = u. \tag{6}$$


Now $U$ is the sum of the divisors of $u$ including $u$ itself, so $U - u$
is the sum of the divisors of $u$, not including $u$ itself. Also, (6)
implies that $(U - u)$ divides $u$ if $N$ is a perfect number, that is,
$(U - u)$ is a divisor of $u$, and it is smaller than $u$ because the
factor $2^{n+1} - 1$ is greater than 1. If $U - u$ is such a divisor itself and is at the
same time the sum of all the divisors of $u$ other than $u$, then it must be
the only such divisor. Since 1 is certainly a divisor of $u$, it must be this single divisor
$U - u$, and $u$ must be a prime. Then (6) simplifies to $u = 2^{n+1} - 1$,
and we have $N = 2^n(2^{n+1} - 1)$. Every even perfect number is of
Euclid’s form.

9. At this point we encounter a second question: Are there any
odd perfect numbers? This problem also remains unsolved. No one
has found an odd perfect number, and it appears very unlikely that
one exists, but this has not been proved.
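Both statements can at least be checked for small numbers. The following sketch is an illustration only, with hypothetical helper names of our own (`proper_divisor_sum`, `has_euclid_form`); a search below 10,000 naturally proves nothing about larger numbers. It lists every perfect number in that range and confirms that each is even and of Euclid's form.

```python
def proper_divisor_sum(n):
    """Sum of the divisors of n, not counting n itself."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            partner = n // d
            if partner != d:
                total += partner
        d += 1
    return total - n  # the loop counted n itself (as partner of 1), so remove it


def has_euclid_form(N):
    """True if N = 2^n * (2^(n+1) - 1) with n >= 1 and 2^(n+1) - 1 prime."""
    n = 0
    while N % 2 == 0:
        N //= 2
        n += 1
    u = N  # the odd part of the original number
    # u is prime exactly when its only proper divisor is 1
    return n >= 1 and u == 2 ** (n + 1) - 1 and proper_divisor_sum(u) == 1


perfect = [N for N in range(2, 10000) if proper_divisor_sum(N) == N]
print(perfect)                                   # [6, 28, 496, 8128]
print(all(has_euclid_form(N) for N in perfect))  # True
```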

10. A passage in the fifth book of Plato’s Laws also leads one to 
think that Euclid did not include all that was known at his time. 
There Plato recommends that, in a newly-founded city, the number 
of plots of land and of landowners be chosen so that it will have as 
many divisors as possible, perhaps 5040 with 60—1 divisors. He 
points out that the legislators must understand enough arithmetic 
to be able to arrange for cities of any size. This involves the 
problem of the number of divisors of a number. In order to solve it, 
we can use part of our discussion concerning the sum of the divisors. 
For example, all the divisors of $p^3 q^2$ are included in the table (2)
and we can count them by rows, $3 \cdot 4 = 12$. This number includes
the number $p^3 q^2$ itself, so we must subtract 1 to obtain the number
$12 - 1$ of proper divisors. More generally for $N = p^a q^b$ the number of
proper divisors of $N$ is

$$P = (a + 1)(b + 1) - 1,$$

and for the completely general case $N = p^a q^b r^c \cdots$, it is

$$P = (a + 1)(b + 1)(c + 1) \cdots - 1.$$

In particular, for $N = 5040 = 2^4 \cdot 3^2 \cdot 5 \cdot 7$ we have

$$P = (4 + 1)(2 + 1)(1 + 1)(1 + 1) - 1 = 60 - 1 = 59.$$


This presumably explains Plato’s use of the unusual and clumsy 
notation 60—1 instead of 59; and this notation, along with his 
instructions that the legislators must know the answer to this problem, 
leads one to believe that he was in possession of the solution. 
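Plato's figure can be confirmed with a few lines of code. This is a minimal sketch with a function name of our own choosing (`count_divisors`); it multiplies the factors (a + 1), (b + 1), ... while stripping each prime from the number by trial division.

```python
def count_divisors(n):
    """Number of divisors of n, with n itself included."""
    count = 1
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        count *= exponent + 1  # a prime with exponent a contributes (a + 1)
        d += 1
    if n > 1:
        count *= 2  # one remaining prime factor, with exponent 1
    return count


print(count_divisors(5040))      # 60
print(count_divisors(5040) - 1)  # 59 proper divisors
```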

11. The close connection between the number of divisors and
the sum of the divisors of a number becomes more obvious if we think
of them as two special cases of a general topic. This is the sum $S$
of the $s$th powers of the divisors of a number $N$. For example,
for $N = 6$ we have $S = 1^s + 2^s + 3^s$. For $s = 1$ the sum $S$ is
the sum $D$ of proper divisors of $N$, while for $s = 0$ each individual
term is 1 and $S$ is the number $P$ of proper divisors of $N$. For $s = 2$
it would be the sum of the squares of the proper divisors of $N$, and
so on. For each value of $s$, a formula analogous to (3) can be
found by the same argument. The value $s = -1$ may be used
as well as any other. The $-1$st powers are the reciprocals of the
divisors; so, for $N = 6$, $S$ is

$$\frac{1}{1} + \frac{1}{2} + \frac{1}{3}.$$

The formula analogous to (3) for the sum of the reciprocals of the
divisors of $N$ is clearly

$$R = \left(1 + \frac{1}{p} + \cdots + \frac{1}{p^a}\right)\left(1 + \frac{1}{q} + \cdots + \frac{1}{q^b}\right)\left(1 + \frac{1}{r} + \cdots + \frac{1}{r^c}\right) \cdots, \tag{7}$$

in which, as in (3), the number $N$ itself is included among the divisors.
All of these sums are part of a single theory which arises from the
fact that a table like (2) gives us a complete list of all the divisors
of a number. The whole basis of the theory was known to the
ancients, and the evidence in Plato seems to indicate that they recognized
its beauty and importance.
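The family of sums described in this section can be explored with exact fractions. The sketch below is an illustration under assumptions of our own: the function name `power_sum_of_divisors` and its `proper` switch are hypothetical, and the divisors are found by trying every candidate, which is practical only for small N.

```python
from fractions import Fraction


def power_sum_of_divisors(n, s, proper=False):
    """Sum of d**s over the divisors d of n, as an exact fraction.

    With proper=True the number n itself is left out of the sum.
    """
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    if proper:
        divisors.remove(n)
    return sum(Fraction(d) ** s for d in divisors)


print(power_sum_of_divisors(6, 0))                # 4   (number of divisors)
print(power_sum_of_divisors(6, 1))                # 12  (sum of divisors)
print(power_sum_of_divisors(6, -1))               # 2   (sum of reciprocals)
print(power_sum_of_divisors(6, -1, proper=True))  # 11/6

# Formula (7) for 36 = 2^2 * 3^2: (1 + 1/2 + 1/4)(1 + 1/3 + 1/9)
print(power_sum_of_divisors(36, -1) == Fraction(7, 4) * Fraction(13, 9))  # True
```

The value 2 for the reciprocals of all divisors of 6 is another way of seeing that 6 is perfect: dividing the sum of all divisors, 2N, by N gives the sum of their reciprocals.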


20. Euler’s Proof of the Infinitude 
of the Prime Numbers 


Euclid’s proof of the infinitude of the primes, which we discussed 
in Chapter 1, immediately precedes his consideration of perfect 
numbers. Euler, who took up and extended the study of perfect 
numbers, produced another proof of the infinitude of primes. This 
proof uses the same ideas that are basic in the theory of perfect 
numbers. 

We must make two simple observations before proceeding with 
the proof. 

1. Let AB be a line segment 2 feet long.

[Figure: the segment AB with the successive midpoints M, $M_1$, $M_2$, ... marked.]

If we traverse it from A to its midpoint M, from there to the
midpoint $M_1$ of the remainder MB, from there to the midpoint $M_2$
of the remainder $M_1 B$, etc., we always remain short of B, but we
continually come closer and closer to B. We will have covered
distances of 1 ft., 1/2 ft., 1/4 ft., and so on. All these distances together
will total less than 2 ft.:

$$1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^n} < 2.$$

If the segments continually decrease in some other ratio $x < 1$
instead of 1/2, the same thing is true. To prove this, we again use
the formula for the sum of a geometric series,

$$1 + x + x^2 + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x} = \frac{1}{1 - x} - \frac{x^{n+1}}{1 - x}.$$

Since $x$ is less than 1, the second fraction on the right is a positive
quantity that is subtracted from the first fraction, so we have

$$1 + x + x^2 + \cdots + x^n < \frac{1}{1 - x}.$$

If $p$ is any prime number, we have $\frac{1}{p} < 1$, and we can replace $x$
by $\frac{1}{p}$ in our inequality. We can then assert that if $p$ is any prime
and $n$ is a whole number, then

$$1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n} < \frac{1}{1 - \frac{1}{p}} = \frac{p}{p - 1}. \tag{1}$$
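2. If we write

$$A_m = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2^m},$$

then we have

$$
\begin{aligned}
A_1 &= 1 + \frac{1}{2}, \\
A_2 &= 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) > 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) = 1 + \frac{2}{2}, \\
A_3 &= A_2 + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) > 1 + \frac{2}{2} + \frac{4}{8} = 1 + \frac{3}{2},
\end{aligned}
$$

and, in general,

$$A_m = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{2^m} \geq 1 + \frac{m}{2}. \tag{2}$$

The larger $m$ is, the larger $A_m$ becomes, and $A_m$ can be made as
large as we please by taking $m$ large enough. For example, we
need only choose $m = 1998$ in order to have $\frac{m}{2} = 999$ and hence
$A_m \geq 1000$. In the same way, we can choose $m$ large enough
to make $A_m$ greater than one million if desired.¹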
3. We let $p$ be any prime and write (1) for each of the primes
2, 3, 5, ..., up to $p$:

$$
\begin{aligned}
1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n} &< \frac{2}{2 - 1}, \\
1 + \frac{1}{3} + \frac{1}{3^2} + \cdots + \frac{1}{3^n} &< \frac{3}{3 - 1}, \\
&\;\;\vdots \\
1 + \frac{1}{p} + \frac{1}{p^2} + \cdots + \frac{1}{p^n} &< \frac{p}{p - 1}.
\end{aligned}
$$

Now we form the product $R_p$ of all these sums. This product
will be less than the product $M_p$ of all the right sides,

$$R_p < M_p.$$

We are already familiar, from the previous chapter, with products
of the sort $R_p$. When multiplied out, $R_p$ is the sum of all the terms

$$\frac{1}{2^\alpha \, 3^\beta \cdots p^\omega},$$

where each of $\alpha, \beta, \ldots, \omega$ takes on all values up to $n$. In other
words, $R_p$ is the sum of the reciprocals of all the divisors of

$$N = 2^n \cdot 3^n \cdot 5^n \cdots p^n.$$

Therefore $R_p$ is a sum of reciprocals of whole numbers, but not of
all possible whole numbers. The only numbers that occur are those
made up of the prime factors 2, 3, 5, ..., $p$, and each of these
primes appears to no higher than the $n$th power. For example,

$$\left(1 + \frac{1}{2} + \frac{1}{4}\right)\left(1 + \frac{1}{3} + \frac{1}{9}\right) = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{6} + \frac{1}{9} + \frac{1}{12} + \frac{1}{18} + \frac{1}{36}$$

is the sum of the reciprocals of all divisors of $2^2 \cdot 3^2 = 36$.

¹ It follows from this that $1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$ increases without bound as
$n$ increases (here $n$ is not restricted to be a power of 2, but as it increases it will
eventually exceed any determined power of 2). In mathematics this is expressed
by saying that the “infinite series $1 + \frac{1}{2} + \frac{1}{3} + \cdots$ diverges” even though its
individual terms continually get smaller and smaller. However, this important
fact and the idea of convergence and of infinite processes is not involved in the
present topic.

4. We now come to the actual proof. Let $m$ be any positive
whole number. Then $A_m$ is the sum of the reciprocals of all the
numbers

$$1, 2, 3, 4, \ldots, 2^m. \tag{3}$$

We consider all the primes that divide at least one of the numbers (3), and call
the largest of these primes $q$. Then all the numbers (3) are products
involving only the prime factors 2, 3, 5, ..., $q$. Furthermore,
none of these primes can appear to a power higher than the $m$th
power. For if the smallest prime 2 appears in a number to a power
higher than the $m$th, then that number is larger than $2^m$ and hence
does not appear in the series (3). What is true for the prime 2 is
certainly true for all the larger primes. Therefore the numbers (3)
are all included among the divisors of $2^m \cdot 3^m \cdot 5^m \cdots q^m$.

Now if we form the product $R_p$ of § 3 not for an arbitrary $p$ but
for our particular prime $q$, and with $m$ instead of $n$, it will include
all the terms of $A_m$. The expression $R_q$ will contain other terms
than those of $A_m$, but the important thing is that all the terms of
$A_m$ are to be found among those of $R_q$. In connection with the
example at the end of § 3, we see that all the terms of
$A_2 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}$ are included, along with several other terms.

Because of this relation between $A_m$ and $R_q$ we have $A_m \leq R_q$.
Furthermore, we have $R_q < M_q$ by § 3, and $1 + \frac{m}{2} \leq A_m$ from (2).
Combining these, we find

$$1 + \frac{m}{2} < M_q = \frac{2}{1} \cdot \frac{3}{2} \cdot \frac{5}{4} \cdots \frac{q}{q - 1}. \tag{4}$$

Now $m$ was an arbitrary positive whole number and $q$ was the
largest of all the prime factors of all the numbers in (3). The left
side of (4) can be made arbitrarily large by merely choosing $m$ large
enough. If there were only a finite number of primes, $q$ could
increase up to the last prime but no further. Then the right side
of (4) would increase to a certain value and would then remain
constant. This would contradict the fact that the left side can be
made arbitrarily large; therefore it is proved that the primes are
infinite in number.
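The chain of inequalities used in the proof can be watched at work for small values of m. The sketch below is only an illustration: the names `primes_up_to` and `euler_chain` are our own, exact fractions are used so that no rounding enters, and only small m are practical because the sum has 2^m terms.

```python
from fractions import Fraction
from math import isqrt


def primes_up_to(limit):
    """All primes not exceeding limit, by trial division."""
    return [k for k in range(2, limit + 1)
            if all(k % d != 0 for d in range(2, isqrt(k) + 1))]


def euler_chain(m):
    """Return (1 + m/2, A_m, R_q, M_q) for the given m, as exact fractions."""
    top = 2 ** m
    A = sum(Fraction(1, k) for k in range(1, top + 1))
    R = Fraction(1)
    M = Fraction(1)
    for p in primes_up_to(top):  # the largest of these primes is q
        R *= sum(Fraction(1, p ** e) for e in range(m + 1))  # 1 + 1/p + ... + 1/p^m
        M *= Fraction(p, p - 1)
    return 1 + Fraction(m, 2), A, R, M


for m in range(1, 6):
    lower, A, R, M = euler_chain(m)
    print(m, float(lower), float(A), float(R), float(M), lower <= A <= R < M)
```

For m = 2 the four quantities are 2, 25/12, 91/36 and 3, the last being the product (2/1)(3/2).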

This proof is far more complicated than that of Chapter 1. But 
its importance lies in the fact that the same methods can be used 
for a great many similar but more difficult problems, a few of which
were mentioned in Chapter 1. These methods form the basis for
the theory of the distribution of primes, one of the most extensive 
and difficult fields of modern mathematics. At least in essence, 
they have been handed down to us from the ancients. 


21. Fundamental Principles of Maximum Problems 


We have repeatedly discussed maximum problems. In Chapters 
3, 5, and 6 we exhibited some mathematical miniatures which 
mathematicians of great ability have found time to produce along 
with their more important and lengthier work. In this chapter we
shall discuss some principles that are basic to all these problems.

These principles can be developed by considering an extremely 
simple maximum problem. Let a triangle be given (it is best to think 
of it as cut out of paper). The problem is to find the two points P 
and Q that are as far apart as possible on the surface or its boundary (Fig. 68). 
The answer is easy to guess: P and Q are the ends of the longest side. 
But how can we prove this? 

There is a simple method, a recipe, that we have not used in the
earlier chapters. It will lead us to the answer here as well as in the
other problems. We argue as follows: If one of the points, say P,
lies on the inside of the triangle, then PQ certainly does not have its
maximum length. For on the extension of the line PQ there is a
point $P_1$ that is further from Q than P is, and that is still inside the
triangle. If both P and Q lie on the boundary of the triangle, but
one of them, say P, is not a vertex, then we can find a nearby point
$P_1$ on the boundary whose distance from Q is greater than PQ.
That this is so is clear when PQ is perpendicular to the side on
which P lies (Fig. 69a), as well as when PQ is not perpendicular
to this side (Fig. 69b). Therefore PQ can be a maximum only if both
P and Q are vertices; otherwise it certainly is not. Then PQ is
a side of the triangle and must naturally be the longest side.

Fig. 68   Fig. 69a   Fig. 69b

The same argument will give us the corresponding result for a 
polygon: in order that two points on the surface of a polygon be farthest 
apart, they must be two of the vertices that are farthest apart. In this 
result, the polygon does not need to be convex. In the quadri- 
lateral of Fig. 70 the two vertices at the bottom are the points that 
are farthest apart. 

The maximum problems in the previous chapters can be handled
in the same way. This principle leads to the result much more
quickly than the methods given in those chapters. For example,
to find the largest triangle inscribed in a given circle (Fig. 71),
we suppose that ACB is not equilateral. If AC and BC are unequal
we inscribe in the circle the isosceles triangle ABD, with base AB.
The new triangle has a greater altitude than the original, so it has
a greater area. Therefore a triangle that is not equilateral cannot
be a maximum; the maximum triangle must be equilateral. The
same methods also apply in the case of the pedal triangle.

Fig. 70   Fig. 71

Why did we use very much longer and more complicated proofs 
in the previous chapters? Why wouldn’t we have been satisfied 
with this much shorter method? The reason is that this procedure 
has a serious logical defect. This defect remained unnoticed for 
two centuries until it was brought out clearly by Weierstrass in the 
second half of the nineteenth century. 


Fig. 72


The defect will become apparent if we apply the principle to
another example (Fig. 72). Because the plane has infinite extent,
it is clear that there are no two points that are farthest apart in this
figure; the farther the point P moves to the right, the greater becomes
the distance MP. However, the previous argument can be made
just as before. Suppose P and Q lie anywhere inside the figure or
on the boundary, even including the possibility that they may be
at any of the four vertices A, B, C, D. Unless PQ is exactly the side
AB, a nearby point $P_1$ can be found that increases the distance PQ.
Just as in the earlier cases, for each pair of points P, Q we can find
a nearby pair that are further apart in every case except when the
pair is A, B. No pair other than A, B can give a maximum. If
we now follow the previous argument strictly, we must conclude that
AB is the maximum.

In this last case we have obtained an obviously incorrect result 
by following exactly the same principles. How can we be sure 
that the answer we obtained for the triangle is correct? 

The defect in the method is this: in the case of the triangle we 
proved conclusively that no pair of points, other than the ends of 
the longest side, can possibly be a maximum. But this does not 
tell us that these points are farther apart than any other pair in the 
triangle. If we know that there is a solution of the maximum 
problem, we can logically conclude that it is this pair of points, 
since there is no other possibility. But we must know that there 
is a solution. In the case of Fig. 72, if there were a solution it 
would have to be AB, but here there actually is no maximum. 

It is now clear why we were unable to avoid the apparently
cumbersome proofs in the earlier chapters. They were not only 
desirable for aesthetic reasons, but they were necessary to avoid a 
serious logical error as well. In order to complete all the topics 
of this chapter we must still supply the lacking proof for Fig. 68. 


Fig. 73


We first make a very elementary observation: if we have two
parallel lines g and h (Fig. 73), then the perpendicular PQ is at least
as short as any other segment UV between a point on one line and a
point on the other. Also, PQ is shorter than any segment U’V’
that joins a point U’ on the left of g with a point V’ on the right
of h.

Instead of limiting ourselves to a triangle, we can just as easily 
carry out the proof for a general polygon. 

Let P and Q be any two points inside the polygon or on the
boundary. We draw the two perpendiculars g, h to the line PQ
at its ends (Fig. 74). These two lines cut a parallel strip out of the
whole figure. Now, since at least the point P of the polygon lies
on g, some vertex A of the polygon must lie to the left of g, or possibly
on g. Also, some other vertex B must lie to the right of, or possibly
on, h. The distance AB is not less than PQ, and therefore PQ is
not greater than the largest of all the distances between any two
vertices of the polygon. This last distance is therefore a maximum.
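The result just proved reduces the search for the two farthest points of a polygon to a comparison of finitely many vertex pairs. The following sketch is an illustration with names of our own (`polygon_diameter`, `random_point_in_triangle`); the random sampling is merely a sanity check on a triangle, not a substitute for the proof.

```python
import itertools
import math
import random


def polygon_diameter(vertices):
    """Largest distance between two vertices of the polygon."""
    return max(math.dist(a, b) for a, b in itertools.combinations(vertices, 2))


def random_point_in_triangle(a, b, c, rng):
    """A random point inside the triangle abc (or on its boundary)."""
    s, t = rng.random(), rng.random()
    if s + t > 1:  # fold the unit square onto the triangle
        s, t = 1 - s, 1 - t
    return (a[0] + s * (b[0] - a[0]) + t * (c[0] - a[0]),
            a[1] + s * (b[1] - a[1]) + t * (c[1] - a[1]))


triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
diameter = polygon_diameter(triangle)  # the longest side, of length sqrt(18)

rng = random.Random(0)
points = [random_point_in_triangle(*triangle, rng) for _ in range(200)]
largest_sampled = max(math.dist(p, q) for p, q in itertools.combinations(points, 2))
print(largest_sampled <= diameter)  # True
```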

In the following chapter we shall take up a maximum problem 
which is considerably more difficult than any of the previous ones. 
It will be concerned with finding the figure, bounded by lines or 
curves and with a given perimeter, which has the largest area. We 
shall prove that the only possible maximum is a circle, but shall 
not attempt to prove that the circle actually has a greater area than 
any other figure. This problem is complicated by the fact that 
curved figures come into consideration. Because of this we will 
have to use a whole series of new ideas and facts in order to obtain 
even this limited result. 


22. The Figure of Greatest Area with 
a Given Perimeter 


Why do soap bubbles have the shape of a sphere? It is because 
the walls are made of a substance that is subject to cohesive forces 
tending to increase the thickness and decrease the area of the walls. 
The pressure of the air does not come into play, but the enclosed 
air maintains a fixed volume while the area becomes as small as 
possible. The soap bubble solves the problem of finding the solid
figure with given volume that has the least area. 

The problem that we shall solve is more modest than the one 
which is solved by every soap bubble and every raindrop. Instead 
of considering solid figures, we shall restrict ourselves to two
dimensions and ask for the figure of least perimeter with a given area.

1. Here we have asked for the figure of least perimeter with a
given area, while the title of this chapter asks for the figure of
greatest area with a given perimeter. But this difference is only
apparent. To make this clear, suppose that it is not the circle K
with given perimeter k (Fig. 75), but the figure F with given perimeter
f that has a maximum area. Then we have f = k and F > K.
Now we can contract the figure F proportionally by choosing any
point O inside of it, drawing lines OA, OB, OC, ... from O to the
boundary, and then diminishing each of these lines by a fixed ratio.
We can choose this ratio so that the contracted figure F’, with
perimeter f’, has the same area as K. This diminishes the perimeter,
so we have f’ < k, and therefore F’ is a figure with the same
area as the circle but a smaller perimeter. If this is proved to
be impossible, then the problem in the title is also solved and the
solution is the circle. Since the converse can also be proved in the
same way, the two formulations are equivalent.

Fig. 75

2. We have implicitly used the following theorem concerning 
the perimeter and area of a figure: 

I. If a figure is contracted proportionally around a point O in the ratio
1 : r, then the perimeter is diminished in the same ratio 1 : r and the area
is diminished in the ratio 1 : $r^2$.
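Theorem I can be tried out on a polygon. In the sketch below the contraction is written as multiplication of every distance from O by a factor k, which is our own way of parametrizing the ratio; the function names (`perimeter`, `area`, `scale_about`) are likewise hypothetical, and the area is computed with the usual shoelace formula.

```python
import math


def perimeter(polygon):
    """Total length of the sides of a polygon given as a list of (x, y) vertices."""
    n = len(polygon)
    return sum(math.dist(polygon[i], polygon[(i + 1) % n]) for i in range(n))


def area(polygon):
    """Area of a simple polygon by the shoelace formula."""
    n = len(polygon)
    twice = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        twice += x1 * y2 - x2 * y1
    return abs(twice) / 2


def scale_about(polygon, center, k):
    """Multiply the distance of every vertex from `center` by the factor k."""
    ox, oy = center
    return [(ox + k * (x - ox), oy + k * (y - oy)) for x, y in polygon]


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
smaller = scale_about(rectangle, (1, 1), 0.5)

print(perimeter(rectangle), perimeter(smaller))  # 14.0 7.0
print(area(rectangle), area(smaller))            # 12.0 3.0
```

Halving all distances halves the perimeter but divides the area by four, which is why a suitable contraction can restore the area of K while leaving a smaller perimeter.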

We shall use several other theorems concerning perimeters and 
areas of figures. We shall not take up the question of how these 
are proved. This would require an analysis of the exact meanings 
of all the concepts involved, and then we should have to build up a 
complete theory. It is not our purpose to do this, but we shall list 
here all the theorems which we shall use without giving their 
proofs. 

II. If the surface of one figure is part of the surface of another,
then it has a smaller area than the other (Fig. 76).

The analogous statement for the perimeter of a figure is not true. 

This can be seen from Fig. 77, where we merely need to make
the inner curve twist enough so that it is longer than the outer
curve. However, the theorem becomes correct if we limit it to
convex figures.

A figure is called convex if every two points of the figure can be joined
by a straight line that lies entirely in the figure. A circle and a triangle
are convex figures, while Fig. 78 is not, since P and Q are in it but
the line joining them is not entirely in the figure.

Fig. 76   Fig. 77   Fig. 78   Fig. 79

III. If one convex curve encloses another convex curve, then the enclosing
curve has the larger perimeter (Fig. 76).

IV. Every non-convex figure can be rounded off to a convex figure with 
a larger area and smaller perimeter (Fig. 79). 

These four theorems which we shall accept without proof are all 
intuitively self-evident, with the possible exception of III. However, 
we have learned that intuitive evidence is not of much importance. 

3. We are now ready to begin the proof, following the argument 
used by J. Steiner. We first prove that the curve of greatest area with 
a given perimeter must be convex. This follows directly from IV and I.
If we round the curve off according to IV, we obtain a convex
figure with smaller perimeter and larger area. If we now expand
proportionally by the appropriate ratio to obtain the original
perimeter, the area is again increased, as can be seen from I. We
have constructed a convex curve with the same perimeter but with 
a greater area. Therefore a curve that is not convex can never 
have maximum area for a given perimeter. 

4. We need only consider convex figures from now on. Next 
we will prove that corresponding to any convex figure there is a convex
figure of the same perimeter, with an area at least as large, and which has an 
axis of symmetry. 

Let P be any point on the curve (Fig. 80), and let Q be the point
on the curve whose distance from P, measured along the curve, is
exactly half the perimeter. The chord PQ cuts the figure into two
parts. If these parts have different areas we choose the larger part
and reflect it in the line PQ; if they have equal areas, we choose
either part and reflect it in PQ. This part and its reflection form a
figure with the same perimeter as the original and with area at least
as large. Furthermore, this figure has an axis of symmetry. If
the figure is not convex, it can be made so by using IV and I, as
was done in § 3. This will only serve to increase the area, and it is
easy to see that it will not destroy the symmetry.

Fig. 80   Fig. 81

This completes the proof of the assertion, but there is still another 
point to be mentioned. If the original figure is a circle, then PQ 
is a diameter and the figure obtained by reflection is the same 
circle. If the original figure is not a circle, then at least one of the 
two parts into which PQ divides it is not a semicircle (Fig. 81). 
If one part is larger than the other, then the new figure obtained by 
reflection has a larger area. If the two parts have equal areas, let 
us always agree to choose the part that is not a semicircle. Then the 
new figure will not be a circle. If the new figure is not convex, 
we can round it off by means of IV and I, once more increasing 
its area. Therefore we can now say either that there is a figure 
of greater area with the same perimeter, or that the new figure is 
convex and has the same area and perimeter as the original. Further- 
more, the new figure will be a circle only if the original one is 
circular. 

5. If the original figure is not a circle, we can either “better”
it or replace it with a convex figure with the same area and perimeter,
a figure which possesses an axis of symmetry but is not a circle.
If the figure can be “bettered”, it is certainly not a maximum.

We will now show that if a convex figure with an axis of symmetry
is not a circle, we can construct another convex figure with the same
perimeter and with an area that is definitely larger. This will
complete the proof, since we will then have proved that every figure
that is not a circle can be “bettered”.

If AB is the axis of symmetry (Fig. 82), there must be some point
C on the bounding curve that makes the angle ACB different from a
right angle. For if every point formed a right angle, then, according
to a well-known theorem (see Fig. 18), the curve would be a circle
with diameter AB, and we are supposing that the curve is not a
circle.

We now construct a right triangle $A_1B_1C_1$ (Fig. 83) with legs
$A_1C_1 = AC$ and $B_1C_1 = BC$. We reflect this triangle in $A_1B_1$, thus
forming a quadrilateral $A_1C_1B_1C_1'$. On the sides of this
quadrilateral we place the corresponding segments of the original figure
that are cut off by $ACBC'$ (the segments that are shaded in Figs.
82 and 83).


The perimeter for the new figure remains the same as for the old,
since it consists of the same four arcs. The area of the new figure
is made up of the four segments and the quadrilateral $A_1C_1B_1C_1'$.
The segments are the same in both figures, so we need only compare
the quadrilaterals or even the triangles $ABC$ and $A_1B_1C_1$ since the
figures are symmetrical. The triangle $A_1B_1C_1$ has the area
$\frac{1}{2} A_1C_1 \cdot B_1C_1$. The altitude of $ABC$ from B is less than BC, since the
angle at C is not a right angle. Therefore the area of $ABC$ is less
than $\frac{1}{2} AC \cdot BC$. Now we have $A_1C_1 = AC$ and $B_1C_1 = BC$, so the
area of $ABC$ is less than that of $A_1B_1C_1$, and therefore the new figure
has a larger area than the original.
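The comparison of the two triangles rests on the fact that, with the sides AC and BC held fixed, the area is largest when the angle at C is a right angle. The sketch below checks this numerically; the function name `triangle_area` is our own, and the formula one half of AC times BC times the sine of the angle at C is simply the altitude argument of the text written in trigonometric form.

```python
import math


def triangle_area(ac, bc, angle_c_degrees):
    """Area of a triangle with sides ac, bc enclosing the given angle at C."""
    # The altitude from B onto the line AC has length bc * sin(C).
    return 0.5 * ac * bc * math.sin(math.radians(angle_c_degrees))


ac, bc = 3.0, 4.0
right_angle_area = triangle_area(ac, bc, 90)  # 6.0, one half of 3 * 4

for angle in (30, 60, 89, 91, 120, 150):
    print(angle, triangle_area(ac, bc, angle) < right_angle_area)  # True every time
```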

The proof is now complete: if we are given any figure that is not
a circle, we can find another figure with the same perimeter and a
larger area. The only figure that can possibly be a maximum is
the circle.

We have not proved that the circle has a greater area than any
other curve with the same perimeter. The relation between what
we have proved and the complete problem has already been discussed 
in the last chapter. The complete solution of this problem can be 
given, but it would necessitate the systematic building-up of an 
extensive theory, and that is beyond the intentions of this book.


23. Periodic Decimal Fractions


1. The expansion of a common fraction into a decimal is a
familiar process. When it is carried out, decimals of quite different 
sorts may arise, as is shown by the following examples: 


$$
\begin{array}{lll}
\text{I.}   & \frac{1}{5} = 0.2, & \frac{3}{40} = 0.075, \\
\text{II.}  & \frac{4}{9} = 0.4444\cdots, & \frac{1}{7} = 0.142857142857\cdots, \\
\text{III.} & \frac{1}{6} = 0.1666\cdots, & \frac{7}{30} = 0.2333\cdots.
\end{array}
$$

The simplest are those of type I. Remembering the meaning
of a decimal, we can write them as common fractions with their
denominator powers of 10:

$$\frac{1}{5} = \frac{2}{10}, \qquad \frac{3}{40} = \frac{75}{1000}.$$

These equations do not imply anything unusual. They merely
assert that the given fractions can be “extended” so that their
denominators become powers of 10. That can be done with any
fraction whose denominator divides some power of 10. The
characteristic of such a denominator is that it has no prime factors
other than 2 and 5, since $2^\alpha \cdot 5^\beta$ will divide $10^\gamma$ if $\gamma$ is the larger
of $\alpha$ and $\beta$. Therefore a fraction with the denominator $2^\alpha \cdot 5^\beta$ can
always be extended to a fraction with denominator $10^\gamma$, and
consequently to a $\gamma$-place decimal.
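The criterion of this section is easily turned into a test. The following sketch is an illustration with a function name of our own (`decimal_places`); it first reduces the fraction to lowest terms, an assumption needed so that the exponents of 2 and 5 in the denominator are not inflated by common factors.

```python
from fractions import Fraction


def decimal_places(numerator, denominator):
    """Number of places of the finite decimal, or None if the decimal is infinite."""
    d = Fraction(numerator, denominator).denominator  # denominator in lowest terms
    alpha = beta = 0
    while d % 2 == 0:
        d //= 2
        alpha += 1
    while d % 5 == 0:
        d //= 5
        beta += 1
    if d != 1:
        return None  # some prime other than 2 and 5 divides the denominator
    return max(alpha, beta)  # the exponent called gamma in the text


print(decimal_places(1, 5))   # 1     (0.2)
print(decimal_places(3, 40))  # 3     (0.075, since 40 = 2^3 * 5)
print(decimal_places(4, 9))   # None
print(decimal_places(7, 30))  # None
```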

2. If the denominator of a fraction has a factor $k$ which is not
divisible by 2 or 5, then it cannot be extended to a fraction with a
power of 10 as the denominator. For the assumption

$$\frac{1}{2^\alpha \cdot 5^\beta \cdot k} = \frac{a}{10^\gamma} = \frac{a}{2^\gamma \cdot 5^\gamma}$$

would imply

$$2^{\gamma - \alpha} \cdot 5^{\gamma - \beta} = a \cdot k.$$

The number $k$, which is supposed to be greater than 1, would therefore
be divisible by 2 or 5 since, because of unique factorization, 2 and 5
are the only primes dividing $a \cdot k$ and hence $k$.

The examples II and III are of this type. In this case we say
that the fractions $\frac{4}{9}$, $\frac{1}{7}$, $\frac{1}{6}$, and $\frac{7}{30}$ can be expanded into “infinite
decimal fractions.” These are not, properly, decimal fractions in
the sense of having a denominator that is a power of 10. There
is no last digit in the decimal expansion, so there is no suitable power
of 10.

When we speak of an “infinite decimal fraction” we are using
the term “decimal fraction” in a new and extended sense. For
our purposes there is no point in going into the exact meaning of
this new sense. All that concerns us is the formal side of the
question: in this case the process of expanding the fraction into
a decimal does not break off. It unendingly produces more and
more digits of the infinite decimal. More than this, the infinite
decimals will be periodic. That is, from a certain point on, the
sequence of digits will consist of the mere repetition, over and over
again, of the same group of digits. For example, the expansion of
$\frac{1}{7}$ consists, from the very start, of the repetition of 142857. Since
there are only 10 digits, some digit must appear repeatedly in any
infinite decimal fraction, but this does not mean that all such
decimals are periodic. An example of a non-periodic infinite decimal
fraction is the decimal

0.101001000100001 ...,

where only the digits 0 and 1 are used, and where the nth 1 is
followed by exactly n digits 0.

3. Examples of fractions in which the denominator has a prime 
factor other than 2 and 5 are given in II and III. We shall now 
show that such fractions always lead to periodic decimal expansions. 
First considering the example 1/7, we obtain the decimal expansion
by the usual method of long division: 


        0.142857...
    7 ) 1.000000
          7
          30
          28
           20
           14
            60
            56
             40
             35
              50
              49
               1


As soon as the remainder 1 has appeared, the whole process of 
division starts over again right from the beginning. The sequence
of remainders 1, 3, 2, 6, 4, 5 and the sequence of quotients 1, 4, 
2, 8, 5, 7 will repeat themselves over and over. The decimal 
fraction will have the period 142857. Since we will want to carry 
out the divisions for other examples, it will be convenient to represent 
the process in a more compact way. We shall write 


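\[
\begin{array}{rcccccccl}
\dfrac{1}{7} = & 0. & \overline{1} & \overline{4} & \overline{2} & \overline{8} & \overline{5} & \overline{7} & \cdots \\[4pt]
 & 1 & 3 & 2 & 6 & 4 & 5 & 1 &
\end{array}
\]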


where we have written the remainder under the newly-found 
quotient at each step. The period of the infinite decimal fraction 
will always be distinguished by a bar over it. 

Another example is 


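\[
\begin{array}{rccccccl}
\dfrac{3}{41} = & 0. & \overline{0} & \overline{7} & \overline{3} & \overline{1} & \overline{7} & \cdots \\[4pt]
 & 3 & 30 & 13 & 7 & 29 & 3 &
\end{array}
\]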


Here the division process starts to repeat when the remainder 
3 appears, since the whole process began with 3. Some of the 
digits in the quotient can repeat before the end of the period; it 
is not this but the repetition of a digit in the remainder that causes 
the process to begin over again and the period to end. 


In carrying out the division for any fraction $a/b$ we must eventually
find a remainder that has already appeared before. (The numerator
of the fraction is considered as the first remainder.) The only
remainders that are available are $1, 2, 3, \ldots, b - 1$. The remainder
0 is not included, since its appearance would mean that the division
comes out even and the decimal is finite. But we have seen that
this is impossible if the denominator has a prime factor other than 2
and 5. Now, since there are only $b - 1$ remainders available, the
division process must start over anew at the $b$th step or earlier. The
decimal expansion of $a/b$ is periodic and its period has at most
$b - 1$ places. For 1/7 this maximum length is reached, for 3/41 it
is not reached. A further example with period of maximum
length is

\[
\begin{array}{rcccccccccccccccccl}
\dfrac{1}{17} = & 0. & \overline{0} & \overline{5} & \overline{8} & \overline{8} & \overline{2} & \overline{3} & \overline{5} & \overline{2} & \overline{9} & \overline{4} & \overline{1} & \overline{1} & \overline{7} & \overline{6} & \overline{4} & \overline{7} & \cdots \\[4pt]
 & 1 & 10 & 15 & 14 & 4 & 6 & 9 & 5 & 16 & 7 & 2 & 3 & 13 & 11 & 8 & 12 & 1 &
\end{array}
\]

with period of length $b - 1 = 17 - 1 = 16$.
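The compact scheme used above, with each remainder written under its quotient digit, can be imitated by a short program. The following Python sketch is only an illustration; the name `long_division` and the form of its result are our own choices and do not come from the text.

```python
def long_division(a, b):
    """Expand the proper fraction a/b by long division.

    Returns (digits, remainders, start). The division stops at the first
    repeated remainder; `start` is the position at which that remainder
    first appeared, or None if the division comes out even.
    """
    digits = []          # quotient digits after the decimal point
    remainders = [a]     # the numerator counts as the first remainder
    first_seen = {a: 0}  # remainder -> position of its first appearance
    r = a
    while True:
        q, r = divmod(10 * r, b)
        digits.append(q)
        if r == 0:
            return digits, remainders, None
        remainders.append(r)
        if r in first_seen:
            return digits, remainders, first_seen[r]
        first_seen[r] = len(remainders) - 1


digits, remainders, start = long_division(1, 17)
print("".join(map(str, digits)))   # the digits of one period of 1/17
print(remainders)                   # the row of remainders
print(len(digits) - start)          # the length of the period
```

Since only the remainders $1, \ldots, b - 1$ are available, the loop must stop after at most $b$ steps, which is exactly the argument of the text.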

4. From now on we shall consider fractions whose denominators
are relatively prime to 10, that is, those which do not have
the prime factors 2 or 5. We will then be able to say considerably
more about the length of the period.


The fraction $a/b$ will naturally be a reduced fraction; $a$ and $b$ are
relatively prime. Then the remainders that appear in the division
will all be prime to $b$. For suppose $r$ is a remainder that is prime
to $b$. The next step of the division consists of dividing $10r$ by $b$
to obtain a quotient $q$ and a remainder $r_1$. That is, $b$ goes into
$10r$ just $q$ times and it leaves over the remainder $r_1$,

\[ 10r = qb + r_1, \]

or

\[ r_1 = 10r - qb. \]

No prime divisor of $b$ can divide either 10 or $r$. Therefore none
divides $10r$ and none divides $10r - qb$. Consequently every
remainder $r$ that is prime to $b$ is followed by another remainder $r_1$
that is also prime to $b$. Since the numerator $a$ is counted as the
first remainder and it is prime to $b$, so will all the remainders be
prime to $b$. Therefore the period of $a/b$ can be no longer than the number
of remainders that are prime to $b$.

The number of remainders prime to $b$ is a number which is of
interest for its own sake. In the theory of numbers it is usually
designated by the symbol $\varphi(b)$, and we have, for example, $\varphi(2) = 1$,
$\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, $\varphi(6) = 2$, $\varphi(7) = 6$, $\varphi(8) = 4$.
Since for a prime $p$, all the $p - 1$ smaller numbers are prime to it,
we have

\[ \varphi(p) = p - 1. \tag{2} \]

Using the symbol $\varphi(b)$, we can now say that the period of $a/b$ has
at most $\varphi(b)$ places if $b$ is prime to 10.
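The count of remainders prime to $b$ can be checked directly by machine. The Python sketch below is a brute-force illustration under our own naming (`phi`, `remainders_met`); it counts by testing every candidate, as the text does, and makes no claim to efficiency.

```python
from math import gcd


def phi(b):
    """Count the numbers among 1, ..., b that are prime to b."""
    return sum(1 for r in range(1, b + 1) if gcd(r, b) == 1)


def remainders_met(a, b):
    """Remainders arising in the division of a by b, up to the first repetition."""
    met, r = [], a
    while r not in met:
        met.append(r)
        r = (10 * r) % b
    return met


print([phi(n) for n in range(2, 9)])   # the values quoted in the text

b = 21   # any denominator prime to 10 may be tried here
for a in range(1, b):
    if gcd(a, b) == 1:
        met = remainders_met(a, b)
        assert all(gcd(r, b) == 1 for r in met)   # every remainder is prime to b
        assert len(met) <= phi(b)                  # so the period has at most phi(b) places
```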
5. We have seen that the decimal expansion for a fraction $a/b$
must be periodic if $b$ has a prime factor different from 2 and 5,
even when $b$ is not prime to 10. We saw that some one of the
finite number of possible remainders must eventually be repeated.
Therefore we knew that there must be a period, but we could not
be sure where the period would begin. In the example II, the
period begins immediately after the decimal point, while in III it
begins only after having passed a digit. Now the denominators of
II were chosen to have no factor in common with 10, and we will
show that the period of the decimal expansion of $a/b$ will always
begin immediately after the decimal point if $b$ is prime to 10. To
do this we will have to show that the first remainder that is repeated
is the first in the whole series, the numerator itself. If two
remainders are equal, $r_m = r_n$, then the two preceding remainders
$r_{m-1}$ and $r_{n-1}$ are also equal, if there actually are any preceding
remainders. For $r_m$ and $r_n$ have arisen from the division of $10 r_{m-1}$
and $10 r_{n-1}$,

\[ 10 r_{m-1} = q_{m-1} \cdot b + r_m, \]
\[ 10 r_{n-1} = q_{n-1} \cdot b + r_n. \]

Using $r_m = r_n$, we find by subtraction

\[ 10 (r_{m-1} - r_{n-1}) = (q_{m-1} - q_{n-1}) b, \]

and therefore $b$ divides $10 (r_{m-1} - r_{n-1})$. Since $b$ is prime to 10 it
must divide $r_{m-1} - r_{n-1}$, so this difference must be one of the
numbers

\[ 0, \pm b, \pm 2b, \pm 3b, \ldots. \]

Now, however, $r_{m-1}$ and $r_{n-1}$ are remainders, so each is less than
$b$, and this difference is then numerically less than $b$. The only
possibility is that

\[ r_{m-1} - r_{n-1} = 0, \qquad r_{m-1} = r_{n-1}. \]

Stepping back in this way from any two equal remainders, we finally
reach the first remainder, the numerator itself. Therefore the period
begins as early as possible, immediately after the decimal point.
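The statement just proved lends itself to a numerical check. The following Python sketch counts the digits that precede the period; the function name `preperiod_length` is a hypothetical one chosen for this illustration.

```python
from math import gcd


def preperiod_length(a, b):
    """Number of digits of a/b that come before the period starts.

    For a finite decimal every digit is counted, since there is no period.
    """
    first_seen, r, step = {}, a % b, 0
    while r != 0 and r not in first_seen:
        first_seen[r] = step
        r = (10 * r) % b
        step += 1
    return step if r == 0 else first_seen[r]


# If b is prime to 10, the period begins at once for every reduced a/b.
for b in range(3, 200):
    if gcd(b, 10) == 1:
        assert all(preperiod_length(a, b) == 0
                   for a in range(1, b) if gcd(a, b) == 1)

print(preperiod_length(1, 6), preperiod_length(1, 12))   # denominators not prime to 10
```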
6. If we designate the length of the period by $\lambda$, then in developing
$a/b$ in a decimal fraction we come to the first remainder $a$ again
after $\lambda$ steps of the division process. At each step of the division
we have moved one more place to the right of the decimal. The
remainder $a$ that appears after $\lambda$ steps really represents $a/10^\lambda$.
Therefore if we divide $10^\lambda a$ by $b$ the remainder is $a$, and consequently
$10^\lambda a - a$ is divisible by $b$. Since we have

\[ 10^\lambda a - a = a (10^\lambda - 1), \]

and since $a$ is prime to $b$, this means that $b$ divides $10^\lambda - 1$. Since
the period ends at the first repetition of the remainder $a$, $\lambda$ is the
smallest number for which $10^\lambda - 1$ is divisible by $b$, that is, the
smallest number for which $a \cdot 10^\lambda$ has the remainder $a$ on division
by $b$.

We have proved

Theorem I. The length $\lambda$ of the period of $a/b$ is the smallest number $\lambda$
for which $10^\lambda - 1$ is divisible by $b$.

Two facts that are implicitly included in this theorem deserve
to be emphasized. First, to every number $b$ that is relatively prime
to 10, there corresponds a number $\lambda$ such that $10^\lambda - 1$ is divisible
by $b$. The existence of such a $\lambda$ is not self-evident, but we have
obtained it as the length of the period and have proved the existence
of a period in § 3. Second, the number $\lambda$ is determined by $b$ and is
independent of $a$. All reduced fractions $a/b$ with the same denominator
$b$ have periods of the same length. We shall emphasize the dependence
of $\lambda$ on $b$ by writing $\lambda = \lambda(b)$.
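Theorem I gives a way to compute $\lambda(b)$ without writing down a single digit. The Python sketch below, with function names of our own choosing, finds the smallest $\lambda$ for which $10^\lambda - 1$ is divisible by $b$ and compares it with the number of division steps needed for the numerator to return.

```python
from math import gcd


def period_length(b):
    """Smallest lam >= 1 for which 10**lam - 1 is divisible by b."""
    if gcd(b, 10) != 1:
        raise ValueError("b must be prime to 10")
    lam, power = 1, 10 % b
    while power != 1 % b:
        power = (power * 10) % b   # power is 10**lam reduced modulo b
        lam += 1
    return lam


def steps_until_return(a, b):
    """Number of division steps of a/b (0 < a < b) until the remainder a comes back."""
    r, steps = (10 * a) % b, 1
    while r != a:
        r = (10 * r) % b
        steps += 1
    return steps


for b in (7, 17, 21, 41):
    lam = period_length(b)
    # The period length does not depend on the numerator of the reduced fraction.
    assert all(steps_until_return(a, b) == lam
               for a in range(1, b) if gcd(a, b) == 1)
    print(b, lam)
```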

7. We shall now consider the decimal expansions of $a/b$ for
different values of $a$ but the same $b$. We have already found

\[
\begin{array}{rcccccccl}
\dfrac{1}{7} = & 0. & \overline{1} & \overline{4} & \overline{2} & \overline{8} & \overline{5} & \overline{7} & \cdots \\[4pt]
 & 1 & 3 & 2 & 6 & 4 & 5 & 1 &
\end{array}
\]

and similarly obtain

\[
\begin{array}{rcccccccl}
\dfrac{2}{7} = & 0. & \overline{2} & \overline{8} & \overline{5} & \overline{7} & \overline{1} & \overline{4} & \cdots \\[4pt]
 & 2 & 6 & 4 & 5 & 1 & 3 & 2 &
\end{array}
\]

The period 285714 belonging to 2/7 can be obtained from the period
142857 of 1/7 by a cyclic reordering. This is clear if we notice
that the remainder 2, with which the development of 2/7 begins,
appears also among the remainders for 1/7. From this point on
the two developments must agree step for step. The remainders
3, 4, 5, 6 also arise in the development of 1/7. In fact all the 6
remainders for 7 must arise since the development of 1/7 has a period
of length 6. If we think of the digits of the period as a cycle in
which the last digit of the cycle is supposed to be followed by the
first digit of the period, then we can arrange the remainders and
quotients in a table:

\[
\begin{array}{l|cccccc}
\text{remainders} & 1 & 3 & 2 & 6 & 4 & 5 \\ \hline
\text{quotients} & 1 & 4 & 2 & 8 & 5 & 7
\end{array}
\]

From the table we can read off the decimal expansion of 6/7 or
any other similar fraction. For example, the period of 6/7 begins
with 8 and is therefore 857142, so we have $6/7 = 0.\overline{857142}$.
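The cyclic reordering can be observed for all six numerators at once. The following Python sketch is a small illustration; `period_digits` is a name introduced here, and the function assumes a denominator prime to 10, so that the numerator itself is the first remainder to return.

```python
def period_digits(a, b):
    """One period of a/b as a string of digits (b prime to 10, 0 < a < b)."""
    digits, r = [], a
    while True:
        q, r = divmod(10 * r, b)
        digits.append(str(q))
        if r == a:                 # the numerator has returned: the period is complete
            return "".join(digits)


base = period_digits(1, 7)
rotations = {base[i:] + base[:i] for i in range(len(base))}

for a in range(1, 7):
    period = period_digits(a, 7)
    assert period in rotations     # every a/7 uses a cyclic reordering of 142857
    print(a, period)
```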

The periods that we have been speaking of are periods of quotients. 
As we have seen, however, the remainders also exhibit periods of 
the same sort and the same length. 

8. The periods of $a/b$ cannot exceed $\varphi(b)$ in length, but it is not
necessary that they actually be this long. For 1/7 and 1/17, we found
periods of this maximum length, but for 3/41 we found $\lambda = 5$, which
is less than $\varphi(41) = 40$. The actual length of the period is difficult
to determine in advance and it must be found by computation
in each case. But we can say more about its length $\lambda(b)$ than
the single result $\lambda(b) \le \varphi(b)$.

We shall use the new example 1/21 as a starting point for our
discussion. By testing the 20 numbers less than 21 we easily find
$\varphi(21) = 12$. However, by division we obtain

\[
\begin{array}{rcccccccl}
\dfrac{1}{21} = & 0. & \overline{0} & \overline{4} & \overline{7} & \overline{6} & \overline{1} & \overline{9} & \cdots \\[4pt]
 & 1 & 10 & 16 & 13 & 4 & 19 & 1 &
\end{array}
\]

with period of length $\lambda(21) = 6 < 12 = \varphi(21)$. In the division
process only 6 of the 12 possible remainders arise and we arrange
them in a table:

\[
\text{(A)} \qquad
\begin{array}{l|cccccc}
\text{remainders} & 1 & 10 & 16 & 13 & 4 & 19 \\ \hline
\text{quotients} & 0 & 4 & 7 & 6 & 1 & 9
\end{array}
\]

From this table we can read

\[ \frac{10}{21} = 0.\overline{476190}\cdots, \qquad \frac{4}{21} = 0.\overline{190476}\cdots, \]

and others, but we cannot find the expansion of 2/21. This fraction
requires a new division,

\[
\begin{array}{rcccccccl}
\dfrac{2}{21} = & 0. & \overline{0} & \overline{9} & \overline{5} & \overline{2} & \overline{3} & \overline{8} & \cdots \\[4pt]
 & 2 & 20 & 11 & 5 & 8 & 17 & 2 &
\end{array}
\]

from which we obtain a second table:

\[
\text{(B)} \qquad
\begin{array}{l|cccccc}
\text{remainders} & 2 & 20 & 11 & 5 & 8 & 17 \\ \hline
\text{quotients} & 0 & 9 & 5 & 2 & 3 & 8
\end{array}
\]

Not only does the one new remainder 2 appear in this table, but
all of the other remainders are new as well. It might have been
seen before that none of the old remainders would appear.
Each remainder completely determines the whole further course of
the development, so every remainder of table (A) carries with it
the entire 6-digit period of remainders of (A). Then, if (B) contained
any remainder which appears in (A), it would contain all
of (A). Since (B) has only 6 remainders, it could not contain any
others; but this does not agree with the fact that (B) contains the
remainder 2. The two tables (A) and (B) together now contain
all the $\varphi(21) = 12$ remainders that can be numerators of reduced
proper fractions with denominator 21.
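The construction of the tables (A) and (B) can be carried out mechanically. The Python sketch below is one possible arrangement, assuming a denominator prime to 10; the name `remainder_tables` and the choice of always starting with the smallest unused remainder are ours.

```python
from math import gcd


def remainder_tables(b):
    """Split the remainders prime to b into tables of (remainders, quotients).

    Each table follows one remainder through the division until it returns.
    """
    unused = [r for r in range(1, b) if gcd(r, b) == 1]
    tables = []
    while unused:
        r = unused[0]              # smallest remainder not yet in any table
        rems, quots = [], []
        while r not in rems:
            rems.append(r)
            q, r = divmod(10 * r, b)
            quots.append(q)
        tables.append((rems, quots))
        unused = [u for u in unused if u not in rems]
    return tables


for name, (rems, quots) in zip("AB", remainder_tables(21)):
    print(name, "remainders", rems)
    print(name, "quotients ", quots)
```

The same function may be applied to any other denominator prime to 10; the number of tables it returns is then the number of separate divisions that are needed.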

9. The ideas involved in the construction of tables (A) and (B)
for the denominator 21 can be used to obtain some general results
concerning the length of the period.

If the development of $1/b$ has the maximum length $\lambda(b) = \varphi(b)$,
there is just one table, as in the case of 1/7 in § 7.
However, if we find $\lambda(b) < \varphi(b)$, as for $b = 21$, then only $\lambda(b)$
remainders occur in the development of $1/b$. Using these we form
a table (A) which clearly does not contain all the $\varphi(b)$ possible
remainders. We choose a remainder $r$ which is prime to $b$ and does
not appear in (A), and develop $r/b$ in a decimal fraction whose period
will also have the length $\lambda(b)$, according to § 6. We use the new
remainders and quotients to form a second table (B). The
remainders in (B) will all be different from those in (A), since (B)
contains $r$ and any remainder of (A) would carry with it all $\lambda(b)$
remainders of (A).

The tables (A) and (B) together contain $2\lambda(b)$ different remainders,
all prime to $b$. Either this number represents all the possible
remainders, in which case we have $2\lambda(b) = \varphi(b)$, or there are still
other remainders. If $s$ is a remainder which is not in (A) or (B),
we develop $s/b$ and obtain another table (C), containing $\lambda(b)$ new
remainders which are neither in (A) nor (B). In all, we would now
have $3\lambda(b)$ different remainders, prime to $b$. If $3\lambda(b)$ is equal to
$\varphi(b)$, all the possible remainders have been exhausted. Otherwise
we repeat the process, forming new tables until all $\varphi(b)$ possible
remainders are used. The important thing is that each time a new
remainder appears, there are $(\lambda - 1)$ other new ones to go along
with it.

This tabulating will finally result in a number, say $k$, of tables
which contain all $\varphi(b)$ remainders prime to $b$. Each table contains
$\lambda(b)$ remainders and no remainder occurs twice. Therefore

\[ \varphi(b) = k \cdot \lambda(b), \tag{3} \]

which proves the theorem:

The length $\lambda(b)$ of the period is a divisor¹ of $\varphi(b)$.

We have already found $\varphi(p) = p - 1$ if $p$ is prime. Therefore
we have the special result that the length $\lambda(p)$ of the period for the
fraction $a/p$ is a divisor of $p - 1$ if $p$ is a prime. A number of examples
of this may be found above, where we found $\lambda(3) = 1$, $\lambda(7) = 6$,
$\lambda(17) = 16$, $\lambda(41) = 5$.
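The theorem can be tested over a whole range of denominators. The following Python sketch is a brute-force check under our own naming; it computes $\varphi(b)$ by counting and $\lambda(b)$ by trying successive powers of 10, and prints the number $k$ of tables for the denominators met in this chapter.

```python
from math import gcd


def phi(b):
    """Count the numbers among 1, ..., b that are prime to b."""
    return sum(1 for r in range(1, b + 1) if gcd(r, b) == 1)


def period_length(b):
    """Smallest lam >= 1 for which 10**lam - 1 is divisible by b (b prime to 10)."""
    lam, power = 1, 10 % b
    while power != 1 % b:
        power = (power * 10) % b
        lam += 1
    return lam


# lambda(b) divides phi(b) for every b prime to 10 in the tested range.
for b in range(3, 500):
    if gcd(b, 10) == 1:
        assert phi(b) % period_length(b) == 0

for b in (3, 7, 17, 21, 39, 41):
    k = phi(b) // period_length(b)   # the number k of tables in equation (3)
    print(b, phi(b), period_length(b), k)
```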

As one more example of the method, we will consider the distribu- 
tion of the remainders among the various periods for the denominator 
39. We start with 


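\[
\text{(A)} \qquad
\begin{array}{rcccccccl}
\dfrac{1}{39} = & 0. & \overline{0} & \overline{2} & \overline{5} & \overline{6} & \overline{4} & \overline{1} & \cdots \\[4pt]
 & 1 & 10 & 22 & 25 & 16 & 4 & 1 &
\end{array}
\]

from which the table of remainders and quotients can easily be
constructed. The question then arises whether the table contains
all the remainders that are prime to 39. Evidently the remainder 2
is missing, so we use it for the construction of the next table:

\[
\text{(B)} \qquad
\begin{array}{rcccccccl}
\dfrac{2}{39} = & 0. & \overline{0} & \overline{5} & \overline{1} & \overline{2} & \overline{8} & \overline{2} & \cdots \\[4pt]
 & 2 & 20 & 5 & 11 & 32 & 8 & 2 &
\end{array}
\]

This yields 6 new remainders. The smallest remainder which is
prime to 39 and which is not in (A) or (B) is 7. This allows us
to find a new table:

\[
\text{(C)} \qquad
\begin{array}{rcccccccl}
\dfrac{7}{39} = & 0. & \overline{1} & \overline{7} & \overline{9} & \overline{4} & \overline{8} & \overline{7} & \cdots \\[4pt]
 & 7 & 31 & 37 & 19 & 34 & 28 & 7 &
\end{array}
\]

Now 14 is the smallest remainder prime to 39 that is not included
among the 18 remainders of (A), (B), and (C), and we use it to
continue our process:

\[
\text{(D)} \qquad
\begin{array}{rcccccccl}
\dfrac{14}{39} = & 0. & \overline{3} & \overline{5} & \overline{8} & \overline{9} & \overline{7} & \overline{4} & \cdots \\[4pt]
 & 14 & 23 & 35 & 38 & 29 & 17 & 14 &
\end{array}
\]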

In all we now have 24 different remainders, all prime to 39. These 
finally exhaust all the possible remainders. In fact, among the 
numbers from 1 to 39 the multiples of 3 and 13 will have a common 
factor with 39. The multiples of 3 account for one-third of all 
these numbers, 13 in all, and the multiples of 13 account for 2 more, 
the numbers 13 and 26 (39 has already been counted as a multiple 
of 3). In all, 15 of the numbers from 1 to 39 have a common 
factor with 39, and the remaining 39 — 15 = 24 are prime to 39. 
Therefore we have $\varphi(39) = 24$, which is in agreement with $\lambda(39) = 6$.
Furthermore, since $\varphi(39) = 4 \cdot \lambda(39)$, there are 4 tables, (A),
(B), (C), and (D).

¹ The improper divisor $\varphi(b)$ itself is not to be excluded.

10. Our results concerning the length of the period can be used 
to obtain an important general theorem. We need the following 
simple lemma: 

If $x$ and $k$ are positive whole numbers, then $x^k - 1$ is divisible
by $x - 1$.

This lemma follows directly from the formula

\[ (1 + x + x^2 + \cdots + x^{k-1})(x - 1) = x^k - 1, \]

which can be obtained from the formula for the sum of a geometric
series, or by direct multiplication. If we use the value $x = 10^{\lambda(b)}$,
then the lemma asserts that $10^{\lambda(b)} - 1$ divides $10^{k\lambda(b)} - 1$. We
choose for $k$ the particular value occurring in (3), and then we have

\[ 10^{k\lambda(b)} - 1 = 10^{\varphi(b)} - 1. \]

Therefore $10^{\lambda(b)} - 1$ is a divisor of $10^{\varphi(b)} - 1$. But according to
§ 6, $b$ divides $10^{\lambda(b)} - 1$, and therefore $b$ is a divisor of $10^{\varphi(b)} - 1$.
We state this in the theorem:

If $b$ is prime to 10, then $10^{\varphi(b)} - 1$ is divisible by $b$.

This theorem no longer has anything to do with decimal fractions,
since $\varphi(b)$ has a completely independent meaning. Equation (2)
gives us the special case:

If $p$ is a prime that does not divide 10, then $10^{p-1} - 1$ is divisible by $p$.

The presence of the number 10 in these theorems is not essential.
It occurs only because of the fact that our ordinary system of
writing numbers is based on the number 10. If we think of using
a number system based on any other number $g$, we can talk about
“$g$-adic” instead of decimal fractions. Our whole discussion can
be repeated without change, and we obtain the general theorems:

Theorem II. If $b$ is prime to $g$, then $g^{\varphi(b)} - 1$ is divisible by $b$.

Theorem III. If $p$ is a prime that does not divide $g$, then $g^{p-1} - 1$
is divisible by $p$.

Here we have found a theorem that extends far beyond the special 
topic of decimal fractions. It is a fundamental theorem of the theory 
of numbers. The special case III is called Fermat’s theorem after its 
discoverer; theorem II is Euler’s generalization of Fermat’s theorem. 

A few examples will illustrate these theorems: 


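$p = 5$:

\[
\begin{aligned}
2^{5-1} - 1 &= 15 = 3 \cdot 5, \\
3^{5-1} - 1 &= 80 = 16 \cdot 5, \\
4^{5-1} - 1 &= 255 = 51 \cdot 5,
\end{aligned}
\]

$p = 7$:

\[
\begin{aligned}
2^{7-1} - 1 &= 63 = 9 \cdot 7, \\
3^{7-1} - 1 &= 728 = 104 \cdot 7, \\
5^{7-1} - 1 &= 15624 = 2232 \cdot 7, \\
10^{7-1} - 1 &= 999999 = 142857 \cdot 7,
\end{aligned}
\]

$b = 6$, $\varphi(6) = 2$:

\[
\begin{aligned}
5^2 - 1 &= 24 = 4 \cdot 6, \\
7^2 - 1 &= 48 = 8 \cdot 6,
\end{aligned}
\]

$b = 9$, $\varphi(9) = 6$:

\[
\begin{aligned}
2^6 - 1 &= 63 = 7 \cdot 9, \\
4^6 - 1 &= 4095 = 455 \cdot 9, \\
5^6 - 1 &= 15624 = 1736 \cdot 9,
\end{aligned}
\]

$b = 10$, $\varphi(10) = 4$:

\[
\begin{aligned}
3^4 - 1 &= 80 = 8 \cdot 10, \\
7^4 - 1 &= 2400 = 240 \cdot 10, \\
9^4 - 1 &= 6560 = 656 \cdot 10.
\end{aligned}
\]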
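Examples of this kind are easily multiplied by machine. The Python sketch below recomputes a list of cases of theorems II and III and then tests theorem II over a range of bases and moduli; the dictionary `examples` and the function name `phi` are introduced only for this illustration.

```python
from math import gcd


def phi(b):
    """Count the numbers among 1, ..., b that are prime to b."""
    return sum(1 for r in range(1, b + 1) if gcd(r, b) == 1)


# modulus -> bases g prime to it
examples = {5: [2, 3, 4], 7: [2, 3, 5, 10], 6: [5, 7], 9: [2, 4, 5], 10: [3, 7, 9]}

for b, bases in examples.items():
    for g in bases:
        value = g ** phi(b) - 1
        assert value % b == 0
        print(f"{g}^{phi(b)} - 1 = {value} = {value // b} * {b}")

# Theorem II for every pair g, b in a small range; pow(g, e, b) works modulo b.
assert all(pow(g, phi(b), b) == 1
           for b in range(2, 60) for g in range(1, 60) if gcd(g, b) == 1)
```

The three-argument form of `pow` reduces modulo $b$ at every step, so the large number $g^{\varphi(b)}$ need never be written out.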


11. After this digression, we now turn back to the decimal 
fractions. We have seen that the development of a reduced 
fraction $a/b$ with denominator $b$ prime to 10 leads to a decimal
fraction whose period begins immediately after the decimal point.

If such a decimal fraction is given, we would like to be able to
find the common fraction from which it came. Let the decimal
fraction have the period $P$ of length $\lambda$. We will think of $P$ as an
ordinary number of $\lambda$ digits. For example, for $1/7 = 0.\overline{142857}\cdots$,
we have $P = 142857$ (one hundred forty-two thousand eight
hundred fifty-seven). If the decimal fraction is the development
of $a/b$ then the remainder $a$ will again appear after $\lambda$ steps of the
division of $a$ by $b$. The part of the quotient obtained up to this
appearance of $a$ is exactly the period. Therefore $a \cdot 10^\lambda - a$ is
divisible by $b$, and we have

\[ a (10^\lambda - 1) = bP, \]

or

\[ \frac{a}{b} = \frac{P}{10^\lambda - 1}. \tag{4} \]

The fraction on the right will not usually be in reduced form, but
on reduction it will give us the desired fraction $a/b$. Since 2 and 5
divide 10, they do not divide $10^\lambda - 1$, and they cannot divide $b$,
which is a divisor of $10^\lambda - 1$. This shows that every periodic
decimal fraction whose period begins immediately after the decimal
point must have come from a common fraction $a/b$ in which the
denominator is prime to 10. We already know that the converse
is true. A decimal fraction in which the period begins immediately
after the decimal point is called a “purely periodic” decimal fraction.
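The passage from a period back to its common fraction takes only a few lines of Python. In the sketch below the function name `fraction_from_period` is our own; the reduction to lowest terms is left to the standard `fractions.Fraction` type.

```python
from fractions import Fraction


def fraction_from_period(period):
    """Common fraction whose purely periodic expansion repeats `period`.

    `period` is the string of digits of one period, leading zeros included,
    so that its length is the length of the period.
    """
    lam = len(period)
    return Fraction(int(period), 10 ** lam - 1)   # P / (10**lam - 1), reduced


for period in ("142857", "07317", "047619", "09"):
    print(period, fraction_from_period(period))
```

The period is passed as a string because leading zeros, as in 07317, belong to the period and fix its length.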
12. We have been considering only purely periodic decimal 
fractions. However, the examples under III in § 1 are not purely 
periodic. Such decimal fractions, in which one or more digits 
occur between the decimal point and the period, are called “mixed
periodic.” Since finite decimal fractions belong to fractions whose 
denominators have only the prime factors 2 and 5, and purely periodic 
ones belong to fractions whose denominators contain neither 2 nor 
5, all that is left for the mixed periodic decimals is to belong to 
fractions whose denominators have a factor in common with 10 and 
a factor that is prime to 10. We merely mention this third case; 
it has nothing to offer that would repay our studying it. 
13. After this discussion of the fundamental properties of periodic 
decimal fractions, we shall conclude this chapter with a consideration 
of a property which is more amusing than significant. The period
of 1/7 consists of the 6 digits 142857. We split them in half and
add the numbers so formed:

\[ 142 + 857 = 999. \]

The period of 1/17 is 0588235294117647 which, when split and
added, gives

\[ 05882352 + 94117647 = 99999999. \]

For the period of 1/11, which is 09, we have:

\[ 0 + 9 = 9. \]

We will show that the sum of the two halves of the period will always
turn out this way when the period belongs to a fraction $a/p$ whose
denominator is a prime, provided that the period has an even
number of digits.

If the length $\lambda$ of the period is even, we can write $\lambda = 2l$. Also,
if the two halves of the period $P$ are $A$ and $B$, then $P$ is a number
of $\lambda$ digits, while $A$ and $B$ have $l$ digits each. Remembering the
significance of the position of a digit in a number, we have

\[ P = A \cdot 10^l + B. \]

Now we know that the fraction $a/p$ can be found from the period $P$
by means of (4):

\[ \frac{a}{p} = \frac{P}{10^\lambda - 1} = \frac{A \cdot 10^l + B}{10^\lambda - 1}. \tag{5} \]

Since $\lambda = 2l$, we have

\[ 10^\lambda - 1 = 10^{2l} - 1 = (10^l - 1)(10^l + 1). \tag{6} \]

Equation (5) shows that the denominator $p$ can be extended to
$10^\lambda - 1$. This means that $p$ divides $10^\lambda - 1$, a fact that we could
have found from § 6. If $p$ divides $10^\lambda - 1 = (10^l - 1)(10^l + 1)$, it
must divide at least one of the two factors, because it is a prime.
Now $p$ cannot divide $10^l - 1$, because $l$ is less than $\lambda$ and $\lambda$ is the
smallest number for which $10^\lambda - 1$ is divisible by $p$ (theorem I,
§ 6). Consequently $p$ must divide $10^l + 1$. From (5) and (6)
we have

\[ \frac{a}{p} = \frac{A \cdot 10^l + B}{(10^l - 1)(10^l + 1)}, \]

and this can be rewritten as

\[ \frac{a (10^l + 1)}{p} = \frac{A \cdot 10^l + B}{10^l - 1}. \]

The left side is a whole number because $p$ divides $10^l + 1$. Therefore
the right side is also a whole number. Now we also have

\[ \frac{A \cdot 10^l + B}{10^l - 1} = \frac{A (10^l - 1) + A + B}{10^l - 1} = A + \frac{A + B}{10^l - 1}, \]

and, since $A$ is a whole number, so is

\[ \frac{A + B}{10^l - 1}. \tag{7} \]

Now $A$ consists of $l$ digits and is greatest when all these digits are
9's. The number consisting of $l$ digits 9 is $10^l - 1$, and therefore
we have $A \le 10^l - 1$. In the same way we find $B \le 10^l - 1$,
and hence

\[ A + B \le 2 \cdot (10^l - 1). \]

The equal sign cannot hold in this inequality, for $A + B = 2 \cdot (10^l - 1)$
implies that both $A$ and $B$ have the value $10^l - 1$. But then $A$
and $B$ would contain only 9's and $P$ would be a sequence of $2l$
equal digits, 9. This is an absurdity, since the period would then
consist of the one digit 9, while we have assumed that it has an even
number of digits. Therefore we have

\[ A + B < 2 \cdot (10^l - 1). \tag{8} \]

Now (7) can be written as

\[ A + B = h (10^l - 1), \tag{9} \]

where $h$ is a positive whole number. According to (8), $h$ is less
than 2, so it must be 1. Then (9) becomes

\[ A + B = 10^l - 1, \]

that is, $A + B$ is the number of $l$ digits $99 \cdots 9$.
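The property just proved can be tried on as many primes as we please. The following Python sketch uses helper names of our own (`period_digits`, `halves`) and also includes the composite denominator 21, for which the sum of the halves fails to consist of 9's; this shows that the assumption of a prime denominator is really used.

```python
def period_digits(a, b):
    """One period of a/b as a string of digits (b prime to 10, 0 < a < b)."""
    digits, r = [], a
    while True:
        q, r = divmod(10 * r, b)
        digits.append(str(q))
        if r == a:
            return "".join(digits)


def halves(a, b):
    """Halves A and B of an even-length period and their length l; None if the length is odd."""
    period = period_digits(a, b)
    if len(period) % 2:
        return None
    l = len(period) // 2
    return int(period[:l]), int(period[l:]), l


for a, p in [(1, 7), (1, 17), (1, 11), (5, 13)]:
    A, B, l = halves(a, p)
    assert A + B == 10 ** l - 1        # A + B consists of l digits 9
    print(f"{a}/{p}: {A} + {B} = {A + B}")

A, B, l = halves(1, 21)                # 21 is not a prime
print(f"1/21: {A} + {B} = {A + B}")    # here the sum is not 999
```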


24. A Characteristic Property of the Circle 


When it rains, the ground is wet; when the ground is wet, it is 
not necessarily raining. This example is frequently used to elucidate 
the difference between a theorem and its converse. Clear as it 
may be in this formulation, it is very badly confused in ordinary 
life. Intelligent persons to whom the difference is crystal clear 
when it is brought to their attention are prone to mix it up un- 
consciously in ordinary intercourse. A political orator can often 
take a statement of his opponent and make it sound ridiculous by 
stating it in its converse form, without the trick’s being noticed by 
his listeners. Every mathematician knows that in teaching he must 
systematically educate the beginning student to avoid committing 
an error by the unconscious use of an unproved converse. 

But the conscious passage from a theorem to its converse is a 
useful and fruitful principle for the researcher in mathematics. 
This and the following chapter, which are otherwise independent, 
will show how this principle serves to lead from known theorems 
to new theorems and whole new concepts. 

We begin with a simple example of a mathematical theorem and its
converse. The theorem that all angles inscribed in a circle and
subtended by the same chord are equal¹ (Fig. 84), is familiar from
elementary geometry. Now the main point is that the converse of 
this theorem is also true: The locus of all points from which a given 
segment AB subtends equal angles is a circle. In elementary 
geometry the importance of this theorem and its converse is not 
brought out clearly. Since both the theorem and converse are true, 
this property of the inscribed angle is a characteristic of the circle. 
It could be used to replace the ordinary definition, that a circle is 
the locus of all points that are at the same distance from a fixed 
point. In fact, all of the most interesting theorems concerning 
circles are proved only after this theorem, and they depend on it 
more directly than on the definition of the circle. 

Following these preliminary remarks, we turn to another property
of the circle and show that it is a characteristic one. In elementary
geometry an angle is defined (for good reasons, to be sure) as the
amount of turning between two straight lines. However, there is no
reason for us not to consider the amount of turning between two
curves. If desired, it may be defined as the angle between the
tangents to the two curves at the vertex (Fig. 85).

[Fig. 84] [Fig. 85]

The circle has the obvious property that the chord joining any 
two points on it meets the circle at the same angle at the two points. 
These angles are angles between a line and a curve. We now form 
the converse and ask: If a curve has the property that every chord joining
every two points on it meets the curve at the same angle at the two points, 
is the curve always a circle, or are there other curves with this same property? 
(Fig. 86). We shall show that it is always a circle, and hence that 
this is a characteristic property of the circle. 
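Before the proof, the property may be tried numerically on a circle and on a curve that is not a circle. The Python sketch below is only an illustration; the parametrizations and the names `line_angle` and `chord_angles` are our own, and the ellipse with semi-axes 2 and 1 is an arbitrary choice.

```python
import math


def line_angle(u, v):
    """Angle between two lines with direction vectors u and v, from 0 to pi/2."""
    cosine = abs(u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    return math.acos(min(1.0, cosine))


def chord_angles(point, tangent, s, t):
    """Angles at which the chord joining point(s) and point(t) meets the curve."""
    (x1, y1), (x2, y2) = point(s), point(t)
    chord = (x2 - x1, y2 - y1)
    return line_angle(chord, tangent(s)), line_angle(chord, tangent(t))


# Each curve is given by its points and its tangent directions.
circle = (lambda t: (math.cos(t), math.sin(t)),
          lambda t: (-math.sin(t), math.cos(t)))
ellipse = (lambda t: (2 * math.cos(t), math.sin(t)),
           lambda t: (-2 * math.sin(t), math.cos(t)))

print(chord_angles(*circle, 0.3, 2.0))    # the two angles agree
print(chord_angles(*ellipse, 0.3, 2.0))   # the two angles differ
```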

Let us suppose that we have a curve with this property, and
that A, B, C are any three points on the curve (Fig. 87). We draw
the three chords joining these points and the three tangents to the
curve at these points. Then, in view of the assumed property,
the angles designated by the same letters in the figure are equal.

¹ If D lies on the lower arc AB, then the angle is to be measured between the
extension of the ray AD through D, and the ray DB (otherwise supplementary
angles must be considered).

[Fig. 86] [Fig. 87]

The three angles at A form a straight angle, and the same is true 
of the angles at B and C. Therefore we have 


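\[ a + \beta + \gamma = 2R, \]
\[ \alpha + b + \gamma = 2R, \]
\[ \alpha + \beta + c = 2R, \]

where $R$ represents a right angle. Adding these equations, we find

\[ (a + b + c) + 2(\alpha + \beta + \gamma) = 6R. \]

Now, since the sum of the three angles of a triangle is $2R$, we have

\[ a + b + c = 2R, \]

and therefore

\[ 2(\alpha + \beta + \gamma) = 4R, \]
\[ \alpha + \beta + \gamma = 2R. \]

Comparing this with

\[ a + \beta + \gamma = 2R, \]

we obtain $a = \alpha$. In the same way we find $b = \beta$, $c = \gamma$.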

Now let D be any other point on our curve. We can draw the 
triangle ABD and treat it exactly as we did the triangle ABC. 
Since the points A and B are the same in both triangles, the chord 
AB and the tangents at A and B are also the same. Because of 
this, the angle $\gamma$ is the same in both figures. Since the angle $d$
at D is equal to $\gamma$, and $\gamma = c$, we have $d = c$ for every point on the
curve. Therefore the chord AB subtends equal angles at all points 
D on the curve. By the converse to the theorem on inscribed 
angles, this implies that D lies on the circle determined by the points 
A, B, C. Since D was any point on the curve, the curve must be a 
circle. 

In contrast to this result, the next chapter will take up a property
of the circle which is not characteristic. We shall find that a whole 
series of curves has this property, and we will see how the principle 
of the converse can lead to new concepts, in this case a remarkable 
class of curves. 


25. Curves of Constant Breadth 


1. A circle is defined as the curve all of whose points lie at a given 
distance from a fixed point, the center. The wheel is a direct 
practical application of this property of the circle. The hub of the 
wheel is held at a fixed height above the ground by the spokes of 
equal length, thus maintaining a smooth horizontal motion. In 
moving very heavy loads, the wheel and axle is sometimes not 
sufficiently strong. In this case one often resorts to the more primi- 
tive use of rollers. The load is merely rolled along over cylindrical 
rollers (Fig. 88) which are continually placed under the front. The 
load moves horizontally over these cylinders, whose cross sections 
are circles. 

Obviously a wheel must be made in the form of a circle with the
hub at the center, since any other form will produce an up-and-down
motion. However, strange to say, it is not necessary that
rollers have a circular cross section in order to perform their services
properly. For rollers the center is no longer important. The
property of the circle that allows it to be used for rollers is that
every pair of parallel tangents is always at the same distance apart,
no matter how the circle is turned. The circle has the same width
in every direction; it is what is called a “curve of constant breadth”.

[Fig. 88] [Fig. 89]


One might expect this property of the circle to be completely 
characteristic, as were the properties discussed in the last chapter. 
But surprisingly, there are other curves with this same property.
Indeed, there is a great multiplicity of curves of constant breadth 
which are not circles. 

2. If we wish to determine the breadth of some curve C in a 
given direction, we can project each point of the curve perpendicularly 
on a line parallel to that direction (Fig. 89). The projection will 
fill up some segment AB of the line, and the length of this segment 
will be the breadth of the curve in the given direction. 
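The definition by projection translates directly into a computation. In the Python sketch below a curve is represented, as an assumption of ours, by a finite list of its points, and the function name `breadth` is likewise our own; for a polygon the corners suffice, since the projection of the whole figure is the segment spanned by the projections of the corners.

```python
import math


def breadth(points, theta):
    """Length of the projection of the points on a line in the direction theta."""
    ux, uy = math.cos(theta), math.sin(theta)
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(breadth(square, 0.0))           # breadth along a side
print(breadth(square, math.pi / 4))   # breadth along a diagonal
```

The square is broader along its diagonal than along its side, so it is not a curve of constant breadth.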

The two lines of projection that are perpendicular to AB at A
and B have at least one point in common with the curve C, while 
the entire curve lies on only one side of the line. We shall call a 
line with this property a ‘supporting line’ of the curve. 

A closed curve has exactly two supporting lines in each direction. 
They can be found by the method of Fig. 89, or we can draw two 
parallel lines in the given direction, containing the curve between 
them, and then slide them together until they just touch the curve 
(Fig. 90). 

A supporting line is not the same as a tangent line. In Fig. 
91a the line t is a tangent at T, but is not a supporting line. In
Fig. 91b s is a supporting line but is not tangent to the curve. 

For a curve of constant breadth, the distance between every pair
of supporting lines is a fixed amount b. If we draw two pairs of
supporting lines to such a curve, the parallelogram which they form
will be a rhombus (Fig. 92). If the pairs of supporting lines are
perpendicular, the rhombus is a square of side b. Therefore all
squares circumscribed about a curve of constant breadth are congruent.
This can be nicely illustrated by cutting a piece of heavy
cardboard into the shape of a figure of constant breadth and cutting
a square hole in another piece of cardboard. If the square has sides
equal to the breadth of the curve, then it will fit the curve no matter
what the direction in which it is turned. The curved figure can be
turned freely inside the square without ever having any room to
spare. Both this and its converse are true: A curve of constant
breadth can be rotated inside a square without any space to spare,
and a curve that can be rotated inside a square is a curve of constant
breadth.

[Fig. 90] [Fig. 91a] [Fig. 91b] [Fig. 92] [Fig. 93]

3. The simplest curve of constant breadth that is not a circle is 
the curvilinear triangle pictured in Fig. 93. The three sides are 
equal arcs of circles, and the center of each arc is the opposite corner.
The three arcs have equal radii which are equal to the constant
breadth b of the curve. Of any two parallel supporting lines, one
must touch at a corner and the other be tangent to the opposite 
side, or else both must touch at corners. In the first case, the distance 
between the supporting lines is clearly b. In the second case, 
each supporting line is tangent to the arc opposite the other corner, 
and their distance is again b. 
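
The constancy of the breadth can also be checked numerically. The following is a minimal sketch, assuming an equilateral triangle of side b = 1 placed at convenient coordinates; the coordinates, the function names and the sample counts are illustrative choices and not part of the text. The breadth in a given direction is measured as the extent of the projection of the sampled curve onto that direction, which is the distance between the two supporting lines perpendicular to it.

```python
import math

def reuleaux_points(b=1.0, samples_per_arc=2000):
    """Sample the boundary of a Reuleaux triangle of breadth b."""
    # Corners of an equilateral triangle with side b (coordinates are an assumption).
    corners = [(0.0, 0.0), (b, 0.0), (b / 2, b * math.sqrt(3) / 2)]
    points = []
    for i, (cx, cy) in enumerate(corners):
        # The side opposite corner i is an arc of radius b with corner i as center,
        # joining the other two corners.
        p = corners[(i + 1) % 3]
        q = corners[(i + 2) % 3]
        a0 = math.atan2(p[1] - cy, p[0] - cx)
        a1 = math.atan2(q[1] - cy, q[0] - cx)
        # Signed angle from p to q, taken the short way round (60 degrees here).
        delta = (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
        for k in range(samples_per_arc + 1):
            a = a0 + delta * k / samples_per_arc
            points.append((cx + b * math.cos(a), cy + b * math.sin(a)))
    return points

def breadth(points, angle):
    """Distance between the two supporting lines perpendicular to the given direction."""
    ux, uy = math.cos(angle), math.sin(angle)
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)

pts = reuleaux_points(b=1.0)
widths = [breadth(pts, math.radians(deg)) for deg in range(0, 180, 5)]
print(min(widths), max(widths))  # both should agree with b up to the sampling error
```

Because the curve is only sampled, the measured breadths can fall short of b by a very small amount; the check is an illustration, not a proof.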

This curvilinear triangle was first discovered, in the sense of a 
curve of constant breadth, by the technologist Reuleaux. He 
proved kinematically that this curve can be rotated inside a square 
without any space to spare. We have just seen that this property 
is characteristic of the curves of constant breadth. 

4. The principle used to construct the Reuleaux triangle can be 
extended to figures with more sides. The essential idea is to draw 
a series of arcs of equal radii in such a way that the center of each 
arc is the opposite corner. We can start with any point B for the 
first corner and draw an arc with radius b and B as center. On 
this arc we choose two points A and C to be new corners. The 
arc of radius b with center C goes through B, since BC = b by the 
previous construction. On this arc we choose another corner D. 
The arc of radius b, with D as center, goes through C. If we wish 
to end this process, we choose the corner E on this arc so that it is 
also on an arc of radius b with center A. That is, E is the inter- 
section of these two arcs. Finally, we join A and D by an arc with 


Fig. 94a Fig. 94b 


center E and obtain a curvilinear pentagon ADBEC of constant 
breadth (Fig. 94a). Curvilinear polygons of more sides can be 
constructed in the same way by delaying the step at which the curve 
is closed. Fig. 94b is such a polygon of 7 sides. Since each corner 
is opposite a side which is an arc of radius b having the corner as 
its center, it immediately follows that this construction produces a 
curve of constant breadth b. For a later purpose we have joined 
each corner with the two ends of the opposite arc by means of radii. 
These radii form a self-intersecting polygon, all of whose sides are 
equal. The angle formed by each pair of radii through a corner 
is the central angle of the opposite arc. 
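
The pentagon construction can be followed step by step in a short program. This is a sketch under stated assumptions: the breadth b = 1, the position of B, and the particular angles used to pick A, C and D are arbitrary choices made here (the text leaves them free within limits), and the helper names are hypothetical. The corner E is found as the intersection of the two arcs, and the breadth is then measured in many directions.

```python
import math

def arc_points(center, start, end, n=2000):
    """Sample the shorter circular arc about `center` from `start` to `end`."""
    r = math.dist(center, start)
    a0 = math.atan2(start[1] - center[1], start[0] - center[0])
    a1 = math.atan2(end[1] - center[1], end[0] - center[0])
    delta = (a1 - a0 + math.pi) % (2 * math.pi) - math.pi  # signed short way round
    return [(center[0] + r * math.cos(a0 + delta * k / n),
             center[1] + r * math.sin(a0 + delta * k / n)) for k in range(n + 1)]

def circle_intersection(p, q, r, near):
    """Intersection of the circles of radius r about p and q that is closer to `near`."""
    d = math.dist(p, q)
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    h = math.sqrt(r * r - (d / 2) ** 2)
    nx, ny = -(q[1] - p[1]) / d, (q[0] - p[0]) / d
    candidates = [(mx + h * nx, my + h * ny), (mx - h * nx, my - h * ny)]
    return min(candidates, key=lambda c: math.dist(c, near))

b = 1.0
B = (0.0, 0.0)
# A and C: two points chosen on the arc of radius b about B (40 degrees apart, an assumption).
A = (b * math.cos(math.radians(20)), b * math.sin(math.radians(20)))
C = (b * math.cos(math.radians(-20)), b * math.sin(math.radians(-20)))
# D: a point chosen on the arc of radius b about C; that arc passes through B.
D = (C[0] + b * math.cos(math.radians(126)), C[1] + b * math.sin(math.radians(126)))
# E: lies on the arc of radius b about D and on the arc of radius b about A.
E = circle_intersection(A, D, b, near=B)

# Each side is an arc of radius b whose center is the opposite corner.
sides = [(B, A, C), (C, B, D), (D, C, E), (E, D, A), (A, E, B)]
curve = [p for center, start, end in sides for p in arc_points(center, start, end)]

def breadth(points, angle):
    """Distance between the two supporting lines perpendicular to the given direction."""
    ux, uy = math.cos(angle), math.sin(angle)
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)

widths = [breadth(curve, math.radians(deg)) for deg in range(0, 180, 3)]
print(min(widths), max(widths))  # both should agree with b up to the sampling error
```

The choice of angles matters: if A, C and D are picked too far from a star-shaped arrangement, the two arcs that define E may fail to meet or the resulting figure may not be convex, so the values above should be read as one workable example.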

All the curvilinear polygons constructed by this method will have 
an odd number of sides. To see this, we mark a corner and its 
opposite side. If we now pass around the curve starting at the 
marked corner, we will first pass a side, then a corner, and so on 
alternately until we pass a corner just before reaching the marked 
side. In all, we will have passed the same number, say n, of sides 
and corners in going from the marked corner to the marked side. 
Now if we start at the marked corner again and pass around the 
curve in the opposite direction, we will again pass n sides and n 
corners before reaching the marked side, since opposite each corner 
on the first path there is a side on the second path, and opposite 
each side there is a corner. Counting in the marked parts, there 
are then 2n + 1 corners and the same number of sides. 

5. The curves that we have constructed all have corners, that is, 
points where two sides meet at an angle. However, we can use 
these curves to obtain new curves of constant breadth that do not 
have any corners. Starting with one of our curves, we draw a 
curve parallel to it and outside it at a fixed distance d (Figs. 95a, 
b, c). This is easily done with the aid of the diagonal polygons 
which we have drawn inside the curves. We merely replace each 
arc of the original curve by an arc having the same center, but 




Fig. 95a Fig. 95b Fig. 95c 


with the radius increased by d. The corners of the original figure 
are considered as arcs of radius 0, so they are replaced by arcs of 
radius d. The resulting figure is made up of an odd number of 
arcs of one radius and the same number of arcs of another radius. 
The arcs pair off, an arc of one radius being paired with one of the 
other radius having the same center (a corner of the original curve). 
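
The breadth of the parallel curve can be read off from this pairing. As a short check, with b the breadth of the original curve and d the distance of the parallel curve as above, and with b' a label introduced here for the new breadth:

```latex
% Two parallel supporting lines touch a pair of opposite arcs with a common center:
% an arc of radius b + d on one side of the center and an arc of radius d on the other.
% The perpendicular joining the points of contact passes through that center, so
b' = (b + d) + d = b + 2d .
```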

The same principle can be used to construct curvilinear polygons 
of constant breadth with arcs having radii of more than two different 
sizes. The opposite arcs are arranged so that they have the same 
center and so that their central angles form vertical angles (Fig. 96). 

These methods allow us to construct an unlimited number of 
curves of constant breadth. However, these curves all have the 


Fig. 96 


special feature that they are formed of a number of circular arcs. 
In order to prevent a misunderstanding, we wish to emphasize that 
there are curves of constant breadth for which no part of the curve, 
no matter how small, is a circular arc. 

6. Now that we have seen some examples of curves of constant 
breadth, we shall consider some of their general properties. In all 
of our examples the curves are convex curves, that is, curves which 
have only two points in common with any line that cuts them. In 
order to simplify the discussion, we shall restrict our considerations 
to convex curves, and we shall always mean such curves even if we 
don’t explicitly state that they are convex. As a matter of fact, 
this is really no restriction at all. It can be proved that every 
curve of constant breadth is convex. However, the proof would 
carry us too far afield, and we can avoid it by making the restriction. 

Carefully defined, a convex curve is the boundary of a convex 
region. A convex region is characterized by the property that every 
two points of it can be joined by a straight line segment that is 
entirely in the region. Examples of convex regions are: a square, 
a circle, a triangle, an ellipse, and all the figures of constant breadth 
that we have mentioned. It is clear that a supporting line of a 
convex region will either have just one point or a whole segment in 
common with the boundary of the region. However, we shall prove 
the theorem: 

Theorem I. A curve of constant breadth has just one point in common 
with each of its supporting lines. 

Before proving this we make a simple observation: 

Theorem II. The distance between any two points on a curve of constant 
breadth b is at most equal to b. 

For if P and Q are two points on the curve (Fig. 97), then the 
two supporting lines perpendicular to the segment PQ must contain 
PQ between them. Therefore the distance between these lines is at 
least as large as the distance PQ. Since the distance between the 
supporting lines is b, the result is proved. 

Turning to the proof of theorem I, we assume that it is false, that 


Fig. 97 Fig. 98 


two points P₁ and P₂ of the curve lie on the supporting line s (Fig. 
98). We draw the supporting line s’ parallel to s on the other side 
of the curve and let Q be a point of contact of s’ and the curve. 
The distance between s and s’ is again b. 

The segments P₁Q and P₂Q cannot both be perpendicular to s, 
since the triangle P₁QP₂ cannot have two right angles. Consequently 
one of the segments is longer than b, but this contradicts theorem II. 
Therefore the assumption of two points of the curve on the supporting 
line is disproved and we have theorem I. 
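
The step "one of the segments is longer than b" can be made explicit with the Pythagorean theorem. In the following sketch F denotes the foot of the perpendicular from Q to s, and x_1, x_2 denote the distances of P_1, P_2 from F; these three names are introduced here and do not appear in the text.

```latex
% Q lies on s', at distance b from s, so QF = b.
P_i Q^2 = QF^2 + P_i F^2 = b^2 + x_i^2 , \qquad i = 1, 2 .
% P_1 and P_2 are different points of s, so x_1 and x_2 cannot both be zero:
x_i > 0 \ \text{for some } i \quad\Longrightarrow\quad P_i Q = \sqrt{b^2 + x_i^2} > b .
```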

If we again use the fact that the perpendicular line joining two 
supporting lines has length b while any other joining line is longer, 
we immediately obtain the theorem: 

Theorem III. If a line joins the two points of contact of two parallel 
supporting lines of a curve of constant breadth, then it is perpendicular to the 
supporting lines. 

7. If we draw a circle of radius b about any point of the curve 
of constant breadth b as center, then, by II, the whole curve will be 
enclosed by the circle. We shall show that the curve cannot lie 
wholly in the interior of the circle, but that it must have at least 
one point on the circumference. 

Let P be any point on the curve C of constant breadth b. With 
P as center we draw a circle K₁ (Fig. 99) which is large enough to 
enclose C but small enough to have a point Q of C on its circum- 
ference. The radius r of K₁ is at most equal to b, since the circle K 


Fig. 99 


of radius b and center P encloses C. Therefore K₁ is inside or at 
most identical with K. 

The tangent t to the circle K₁ at Q goes through the point Q of C. 
Furthermore, C is enclosed by K₁, so it lies all on one side of t. There- 
fore t is a supporting line of C. The supporting line s, parallel to 
t on the other side of C, is at the distance b from t because C is of 
constant breadth b. According to theorem III, the point of contact 
P₁ of s is on the perpendicular to t through Q. If r = b, then 
P₁ falls on P; if r < b, then P lies between Q and P₁. But the 
latter is impossible. The three points Q, P, P₁ would belong to 
C and would lie on a line. Now a convex curve can be cut by a 
line in only two points. A line could have more than two points 
in common with a convex curve only if it were a supporting line. 
But we know, according to I, that a supporting line has only one 
point in common with a curve of constant breadth. Therefore P₁ 
must fall on P, and we have r = b. 

In this proof P was an arbitrary point of C, and we have constructed 
the supporting line s of C through this point. Therefore we have 
also proved the result: 

Theorem IV. There is at least one supporting line through every point 
of a curve of constant breadth. 

A curve of constant breadth may have points at which there is 
more than one supporting line. Such points are called corners. 
In our earlier examples there were many curves of constant breadth 
having corners. At a corner, all the lines that lie in the angle 
formed by two supporting lines are clearly supporting lines them- 
selves (Fig. 100). Therefore a convex curve has a whole bundle 
of supporting lines at each corner. Among these supporting lines 
there are two extreme ones that bound the bundle. 

If P is an arbitrary point of C, then by theorem IV we can draw s, 
the (or a) supporting line of C, at this point. We draw the per- 


Fig. 100 


pendicular to s through P. It will cut C in an opposite point Q, 
and PQ will have the length b. The circle of radius b with center 
Q will then enclose C and will have s as a tangent. This can be 
put in the form of the following theorem: 

Theorem V. Through every point P of a curve of constant breadth, a 
circle of radius b can be drawn that encloses the curve and that is tangent, 
at P, to the supporting line of the curve, or to a predetermined supporting line 
if there is more than one. 

8. The following theorem also relates to curves of constant 
breadth and circles: 

Theorem VI. If a circle has three (or more) points in common with a 
curve of constant breadth b, then the length of the radius of the circle is at 
most b. 



The Reuleaux triangle shows that such a circle may have a radius 
equal to b. If any one of the three arcs is extended to a full circle, 
this circle has radius b and has infinitely many points in common 
with the curve. 

In order to prove theorem VI, we suppose that the circle k has 
the points P, Q, R in common with the curve C of constant breadth b. 
Of the three angles of the triangle PQR, there is at least one that is 
not exceeded by the other two, be it larger than both the others, 
equal to one and larger than the third, or perhaps equal to both 
the others. We can suppose that this angle lies at P, and we shall 
call it α. Through P we now draw the (or a) supporting line of the 
curve of constant breadth. Then we draw the circle K of radius b, 
tangent to the supporting line at P and enclosing C. The points 
Q and R will lie inside or on the circumference of K. If both Q 
and R lie on the circumference, then K and k are identical, since 
there is only one circle that passes through three points P, Q, R. 
In this case there is nothing more to prove. 

Otherwise we extend PQ and PR to their intersections, Q’ and R’, 
with K (Fig. 101). We now wish to prove that Q’R’ is longer 
than QR. 

If Q and Q’ happen to be the same point, then R and R’ are 
different, since the case in which both Q and R lie on K has been 
settled. Here the triangles of Fig. 101 are related, as is shown in 
Fig. 102a. The angle QRR’ = δ is an exterior angle of the triangle 
PQR. Now, by a theorem of elementary geometry, an exterior 
angle is greater than either of the two angles of the triangle which 
are not adjacent to it. In our case we have δ > α. Also, since β 
is an exterior angle of the triangle QRR’, we have β > β’. Because 
we chose α ≥ β in the triangle PQR, we have δ > α ≥ β > β’, or 
δ > β’. Therefore QR’ is a side of the triangle QRR’ which is 
opposite the angle δ, while QR is opposite the smaller angle β’. 
Then, by a well-known theorem of geometry, we have QR’ > QR. 

If Q differs from Q’ and R differs from R’, the triangles are related 
as in Fig. 102b. Because of the theorem concerning the sum of the 
angles in a triangle, we have β + γ = 180° − α = β’ + γ’. It is 
therefore impossible to have β’ > β and γ’ > γ at the same time. 
Let us suppose that it is the first of these inequalities which does not 
hold. We then have β’ ≤ β. In the quadrilateral QQ’R’R we draw 
the diagonal Q’R, the diagonal which does not divide the angle β’. 
(In the case γ’ ≤ γ, we would draw QR’.) Designating the angle 
Q’RR’ by ε, we see that it is an exterior angle of the triangle PQ’R, 
and we have ε > α. Furthermore, since we already have α ≥ β 
and β ≥ β’, we finally get ε > β’. Then, in the triangle Q’R’R, 
the side Q’R’ is opposite an angle greater than the angle opposite 
Q’R, and we have Q’R’ > Q’R. Since our earlier argument can 
be used to show that we also have Q’R > QR, we finally obtain 
the desired result Q’R’ > QR. 

We have now obtained Q’R’ > QR for every case in which the 
circles k and K (Fig. 101) are different. Now α, the angle inscribed 
in the circle k and subtended by the chord QR, is also subtended 
in the circle K by the chord Q’R’. Therefore the chords QR and Q’R’ 
belong to the same central angle 2α in the circles k and K respectively. 
If we bring these central angles together, we obtain Fig. 103. Here we 
recognize at once that the larger chord belongs to the larger circle, 
and therefore we see that the radius b of K is larger than the radius 
of k. This concludes the proof of theorem VI. 
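
The last step, that the larger chord belongs to the larger circle, can also be written as a formula. The sketch below uses the standard relation between a chord and its inscribed angle; the symbol r for the radius of k is introduced here and is not used in the text.

```latex
% A chord subtending the inscribed angle \alpha in a circle of radius \rho has length 2\rho\sin\alpha.
QR = 2\,r\,\sin\alpha \quad (\text{in } k), \qquad Q'R' = 2\,b\,\sin\alpha \quad (\text{in } K).
% Since 0^\circ < \alpha < 180^\circ we have \sin\alpha > 0, and Q'R' > QR gives
\frac{r}{b} = \frac{QR}{Q'R'} < 1 \quad\Longrightarrow\quad r < b .
```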

9. The simplest curve of constant breadth which is not a circle, 
the Reuleaux triangle, possesses corners. The following theorem 
shows that the Reuleaux triangle is outstanding among the curves 
of constant breadth because of its corners. 

Theorem VII. A corner of a curve of constant breadth cannot be more 
pointed than 120°. The only curve of constant breadth that has a corner 
of 120° is the Reuleaux triangle, which has three such corners. 

We measure the angle of a corner by means of the two extreme 
supporting lines of the bundle of supporting lines at the corner. If 
a corner Q has the angle θ, then the bundle of supporting lines 
occupies an angle of 180° − θ (Fig. 104). The perpendiculars at 
Q to all these supporting lines form another bundle that occupies 
an angle of 180° − θ, the angle P₁QP₂. From theorem III we see 
that each of these perpendiculars crosses the curve at a point whose 
distance from Q is b. 

Therefore the part of the curve opposite the corner Q is a circular 
arc of radius b and central angle 180° − θ. According to theorem II, 


Fig. 104 Fig. 105 


the length of the chord P₁P₂ cannot exceed the width b. Then the 
isosceles triangle QP₁P₂ has legs of length b, and its base cannot 
exceed b in length. Therefore the angle P₁QP₂ is at most 60°. 
We have already seen that this angle P₁QP₂ is 180° − θ, so we 
have 180° − θ ≤ 60°, and hence θ ≥ 120°. Since θ was the angle 
at an arbitrary corner, the first part of the theorem is proved. 
Now if the corner angle θ is 120°, the angle P₁QP₂ is 60° and the 
isosceles triangle P₁QP₂ is equilateral (Fig. 105). Then P₁P₂ has 
the length b. Since this length is equal to the breadth of the curve, 
the two supporting lines perpendicular to P₁P₂ must pass through 
P₁ and P₂. From this we can see that P₁ and P₂ must also be 
corners of the curve. The part of the curve between P₁ and P₂ 
is an arc of a circle, as we have already seen. Not only is s₁ 
a supporting line at P₁, but so is the tangent t₁ to the circular arc 
at P₁. The inner angle between these two lines is easily seen to be 
120°. Consequently s₁ and t₁ must be the extreme supporting lines 
of the bundle through P₁, since no corner can have an angle less 
than 120°. The corner at P₁ (and similarly at P₂) therefore has 
exactly the angle 120°. Then P₁ and P₂ have just the same proper- 
ties as Q. Opposite each of them is a circular arc of radius b 
and central angle 60°. But this gives us exactly the Reuleaux 
triangle, and therefore the second part of theorem VII is proved. 
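
The estimate in the first part of the proof, that an isosceles triangle with legs b and base at most b has an apex angle of at most 60°, can be spelled out as follows. This is only a restatement in formulas, with θ the corner angle at Q as in the text.

```latex
% Chord of the arc of radius b with central angle 180^\circ - \theta:
P_1P_2 = 2b\,\sin\!\left(\frac{180^\circ - \theta}{2}\right) \le b
\quad\Longrightarrow\quad
\sin\!\left(\frac{180^\circ - \theta}{2}\right) \le \frac{1}{2} .
% The half-angle lies between 0^\circ and 90^\circ, where the sine is increasing, so
\frac{180^\circ - \theta}{2} \le 30^\circ
\quad\Longrightarrow\quad
\theta \ge 120^\circ .
```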

10. In this chapter we first obtained some special curves of 
constant breadth by seeing how to construct them. Then we 
proved the general properties given by theorems I to VII. These 
properties hold for all convex curves of constant breadth, but they 
do not say anything about the existence of curves of constant breadth. 
We will now give a perfectly general construction that will yield 
every curve of constant breadth. This will give us a complete 
view of all possible curves of constant breadth. The property of 
our curves shown in theorem V is an especially important one. 
Curves of constant breadth are characterized by this property to 
the extent that one may arbitrarily choose one-half of such a curve 
between two opposite points, so long as it satisfies the conditions of 
theorem V. More accurately stated, we assert: 

Theorem VIII. If a convex arc* Γ has a chord of length b, if the entire 
arc lies between the two perpendiculars to the chord at its ends, and if it has the 
property of being enclosed by every circle of radius b tangent to a supporting line 
at its point of contact and lying on the same side of the line as the arc, then the 
curve can be extended to form a curve of constant breadth b. 

11. In proving this theorem it will be convenient to use the idea 
of regions of constant breadth. Since every region of constant 
breadth is bounded by a curve of constant breadth, we need only 
show the existence of a suitable region. 

Before starting the proof we must make a remark about ‘inter- 
sections’ of regions. If a number of regions are given, then the part 
of the regions that is common to all of them is called their inter- 
section. For example, the intersection of the two circles in Fig. 106 
is the shaded region. 

The intersection of an arbitrary set of convex regions is itself convex. 

To prove this we must show that every two points of the inter- 
section may be joined by a line segment that lies entirely within the 
intersection. But this is obvious. For if two points P and Q lie 
in the intersection, they lie in every region of the set. Then, 
because each region of the set is convex, the segment PQ is in all 
the regions. Since it is in all the regions, the segment PQ must 
also be in the intersection. 


* That is, an arc which, together with its chord, bounds a convex region. 

In this proof it is quite immaterial whether the set contains 
finitely or infinitely many convex regions. 


Fig. 106 Fig. 107 


12. We now suppose that the arc Γ (Fig. 107) has the properties 
required by theorem VIII: the chord AB has the length b. The 
arc, together with its chord, bounds the convex region G₁. The 
perpendiculars to AB at A and B are supporting lines of G₁. Every 
circle of radius b that is tangent to a supporting line of Γ at its 
point of contact encloses Γ. 

To G₁ we add a new region ABS bounded by the chord AB, 
the circular arc AS with center B, and the circular arc BS with 
center A. We shall call this region G₂. The two convex regions 
G₁ and G₂ together form a convex region G which is bounded by Γ 
and the arcs AS and BS. 

We now consider the totality of all circles of radius b whose 
centers lie on Γ. The region G and this infinite set of circles have 
a convex intersection D (the shaded part of the figure). We will 
now demonstrate that this region D is a region of constant breadth 
having the arc Γ as part of its boundary. 

If Γ belongs to D it must certainly be on the boundary, since it 
is already on the boundary of G. But Γ is in all the circles of radius 
b with centers on Γ. To demonstrate that this is true, we need to 
show that no two points of Γ can be further apart than the distance b. 
Now, by assumption, Γ lies in both the circles of radius b with 
centers A and B, and therefore it lies in the curved figure SATBS. 
Since Γ lies on one side of AB, it must lie in the region G₃, which 
is the mirror image of G₂ in AB. The distance between any two 
points of G₃ is clearly at most b, so this must be true in particular 
for any two points on Γ. Therefore Γ belongs to D and it is a part of 
the boundary of D. Since D is convex, it must contain every chord 
joining two points of Γ. For this reason G₁ must be a part of D. 

No two points of D can be further apart than the distance b. 
Since D is a part of G and G is made up of the two regions G₁ and 
G₂, we have three cases to consider: If both points are in G₁, then, 
as we mentioned before, they cannot be further apart than the 
distance b. If both points are in G₂ the same is true. Finally, if 
one point P₁ is in G₁ and the other point P₂ is in G₂, we can join 
P₁ with P₂ and extend the line until it cuts Γ, say at P. The three 
points will lie on this line in the order P P₁ P₂. The circle of radius b 
with center P contains all of D, so it contains these three points. 
Consequently P₁ and P₂ lie on a radius of length b, and hence the 
distance between them cannot exceed b. 

The result we have just obtained shows that the region D cannot 
have a breadth greater than b in any direction. We must now show 
that it has the breadth b in every direction. In the direction AB, 
the breadth b was prescribed by the theorem. We consider any 
other direction and draw the two supporting lines of D perpendicular 
to this direction. One of the two, say s₁, will have a point of contact 
Q on Γ. At Q we draw a perpendicular of length b to s₁ and call 
its end point M. Now M belongs to D. To prove this we must 
show that M is in G, as well as in every circle of radius b with center 
on Γ. The latter requires that we show that the distance of M from 
each point of Γ is at most b. This follows from the fact that the circle 
of radius b with center M is tangent to s₁ at Q. According to the 
assumptions, this circle must enclose the arc Γ, and this shows that 
the distance from M to any point of Γ is not greater than b. Since 
in particular the distances AM and BM are not more than b, 
the point M must lie in the figure SATBS. Furthermore, since M 
lies on the opposite side of AB from Γ, it also lies in G₂. Therefore 
it lies in G as well as in all the circles. Consequently M lies in the 
intersection D. 

Since QM is perpendicular to s₁ and s₂, and Q and M belong to 
D, the distance between the supporting lines s₁ and s₂ must be at 
least as large as QM = b. The distance cannot be greater than b, 
since this would mean that the distance between the two points 
of contact was greater than b, and we have seen that this is impossible 
for two points of D. Therefore the distance between s₁ and s₂ is 
exactly b; the region D has breadth b in every arbitrary direction. 

This theorem shows that the arc Γ, satisfying certain requirements, 
can be extended to form a curve C of constant breadth b. It can 
easily be seen that there is only one way in which the extension can 
be made, and that the curve C is therefore uniquely determined. 

13. In conclusion we mention, without proof, another remarkable 
property of these curves: All curves of constant breadth having the 
same breadth b have the same perimeter. This perimeter must 
obviously equal the circumference of a circle of diameter b. This 
fact can easily be verified for the examples constructed in §§ 4 and 5, 
making use of the similarity of circular arcs with the same central 
angle. However, the proof for general curves requires ideas and 
methods which are beyond the scope of this book. The proof can 
be started only after a very careful analysis of the concept of the 
length of a curve. 
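
For the curves built from circular arcs the verification takes only a line or two. In the sketch below the central angles of the arcs are written α_i and measured in radians, and L denotes the perimeter; both symbols are introduced here. It uses the observation that, as a pair of parallel supporting lines turns through a half turn, one of the two points of contact always lies on an arc and each arc is traversed exactly once, so the central angles add up to π.

```latex
% Reuleaux triangle: three arcs of radius b and central angle \pi/3.
L = 3 \cdot b \cdot \frac{\pi}{3} = \pi b .
% Curvilinear polygon of section 4: arcs of radius b with central angles \alpha_i, \ \sum_i \alpha_i = \pi.
L = \sum_i b\,\alpha_i = \pi b .
% Parallel curve of section 5: each angle \alpha_i occurs once with radius b + d and once with radius d;
% the breadth is b + 2d.
L = \sum_i (b + d)\,\alpha_i + \sum_i d\,\alpha_i = \pi\,(b + 2d) .
```

In each case the perimeter equals π times the breadth, the circumference of a circle whose diameter is the breadth.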


26. The Indispensability of the Compass for the 
Constructions of Elementary Geometry 


1. The constructions of elementary geometry are all carried out 
with the aid of a straightedge and compass. In fact, a distinguishing 
property of elementary geometry is the fact that the only implements 
allowed are the compass and straightedge. But these two instruments 
are not entirely necessary. There are many constructions in which 
one or the other can be dispensed with. More than this, according 
to the investigations of Mascheroni and the recently found earlier 
work of Mohr, the straightedge can be dispensed with entirely. 
All constructions that are possible with a straightedge and 
compass can be made with a compass alone. Since a line cannot 
be drawn without a straightedge, in these investigations a line is 
considered as being constructed as soon as two of its points are found. 
On the other hand, Jacob Steiner has found that all the constructions 
of elementary geometry can be made using only a straightedge, 
provided only that a fixed circle and its center have been drawn before- 
hand. It is not difficult to prove that this fixed circle is indispen- 
sable. We shall prove this by showing that a fixed circle whose center 
is unknown is not sufficient to allow all the constructions to be carried 
out with a straightedge alone. Indeed, two non-intersecting circles 
with unknown centers will not suffice. However, it is known that 
two intersecting circles without their centers, or three non-inter- 
secting circles, are sufficient to replace the Steiner circle with center. 

2. Since according to Steiner’s result all the constructions of 
elementary geometry can be carried out as soon as we have a given 
circle and its center, what we must prove is that it is impossible, 
using only a straightedge, to construct the center of the circle if we 
have been given just the circle or two non-intersecting circles. The 
proofs of the impossibility of each of these two constructions will be 
made by showing that the assumption of a construction leads to an 
absurdity. These indirect proofs will make use of the principle of 
mapping. 

If we prove that it is impossible to find the centers of two non- 
intersecting circles with the straightedge alone, then clearly the 
impossibility of finding the center in the case of one given circle 
will follow immediately. But the latter case is geometrically simpler, 
so we shall prove it first. Also it will serve to introduce the ideas 
that are used in both proofs. 

3. Suppose we have some way of finding the center of a circle 
drawn on a piece of paper without using anything but a straightedge. 
Our construction will consist in drawing lines which may cut the 
circle or each other, as well as lines which join intersections that we 
have already found. We will have determined the center of the circle 
when we have found two lines whose intersection is the center. The 
whole figure will then be composed of the given circle and a number 
of lines, two of which intersect at the center of the circle. 

We will now study a particular mapping of this whole figure. 
This mapping will carry the given circle into a circle, every straight 
line into a straight line, and the intersection of two lines into the 
intersection of the two corresponding lines in the map. There are 
obviously many such mappings. For example, any proportional 
magnification or contraction of the figure is such a mapping. But a 
proportional mapping of this sort will not serve our purpose. We 
will have to find a mapping that carries the circle into a circle and 
every line into a line, but that otherwise distorts the figure. In 
particular, we want the center of the circle to be mapped onto a 
point that is not the center of the image circle. 

Once we have found the desired mapping, our proof will be 
practically complete. No matter how much the image differs from 
the original figure, the two are equivalent so far as the construction 
is concerned. Every step of the original proof, drawing a line, 
finding an intersection, joining two intersections, can be carried 
out step by step in the image. But the image of the center of the 
original circle is not the center of the image circle. Therefore the 
images of the lines that intersect at the center of the original circle 
do not intersect at the center of the image circle. Although the 
construction was carried out step for step in the image, it does not 
lead to the center of the circle in the image. This contradicts the 
assumption that the construction does determine the center of the 
circle, and hence shows that it is impossible, using a straightedge, 
to find the center of a given circle. 

For the case of two circles the proof will be quite analogous. 

4. Now we must find a mapping of the sort we have described. 
We shall obtain it by means of a projection in space. Outside of 
the plane E of the figure (Fig. 108) we mark a fixed point O and 
draw another plane E’, the image or projection plane. Each ray 
through O and a point P on E, produced if necessary, will intersect 
E’ at some point P’. Then P’ is the image or projection of P. 
In the same way, a whole figure in E is projected point for point 
into a figure on E’. The projection can be thought of as the shadow 
cast on E’ by the figure in E, if O represents a point-source of light. 
Under the projection every line g will project into a line g’, since 
the totality of rays through O and the points of g forms a plane 
and this plane cuts E’ in a line. 
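
The projection is easy to experiment with in coordinates. The following is a minimal sketch, assuming that E is the plane z = 0, that O = (0, 0, 2), and that E’ is the plane x + z = 0; these positions and all function names are arbitrary choices made here for illustration. It projects three collinear points and confirms that their images are again collinear, while the image of the middle point is no longer the midpoint of the other two images.

```python
import math

def project(O, P, plane_point, plane_normal):
    """Central projection of P from O onto the plane through plane_point with the given normal."""
    direction = [p - o for p, o in zip(P, O)]
    denom = sum(n * d for n, d in zip(plane_normal, direction))
    if abs(denom) < 1e-12:
        raise ValueError("the ray through O and P is parallel to the image plane")
    # Parameter t of the point O + t * (P - O) that lies in the image plane.
    t = sum(n * (q - o) for n, q, o in zip(plane_normal, plane_point, O)) / denom
    return tuple(o + t * d for o, d in zip(O, direction))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def collinear(p, q, r, tol=1e-9):
    """True if the three points lie on one line (cross product of the differences vanishes)."""
    u = [b - a for a, b in zip(p, q)]
    v = [b - a for a, b in zip(p, r)]
    return math.sqrt(sum(c * c for c in cross(u, v))) < tol

O = (0.0, 0.0, 2.0)                                            # center of projection (assumption)
plane_point, plane_normal = (0.0, 0.0, 0.0), (1.0, 0.0, 1.0)   # image plane E': x + z = 0

# Three collinear points of E (the plane z = 0); P2 is the midpoint of P1 and P3.
P1, P2, P3 = (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 2.0, 0.0)
images = [project(O, P, plane_point, plane_normal) for P in (P1, P2, P3)]

print(collinear(*images))                  # the images lie on one line
midpoint = tuple((a + c) / 2 for a, c in zip(images[0], images[2]))
print(math.dist(midpoint, images[1]))      # nonzero: midpoints are not preserved
```

That midpoints are not preserved when E and E’ are not parallel is exactly the kind of distortion the argument needs for the center of the circle.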

The projection of a circle will not generally be a circle. The 
totality of rays through O and the circumference of the circle K 
forms a cone. In general this will be an oblique cone. A cone is 
called a “right” cone if the line connecting its vertex to the center 
of its base is perpendicular to the base. An oblique cone is one that 
is not right. The projection plane E’ cuts the cone in a conic section 
which is known not to be a circle in general. However, it is essential 
for our purposes that the circle project into a circle. There are two 
particular arrangements of the projection that will accomplish this. 



The first, trivial, way is to place the planes E and E’ parallel 
to each other. Then the mapping performed by the projection is 
a proportional mapping, a magnification or contraction according 
to whether E or E’ lies nearer O. This mapping is useless for our 
purposes because it fails to distort the figure. 

The second way to accomplish the mapping depends on a theorem 
of solid geometry. So as not to interrupt the course of our discus- 
sion, we shall postpone the proof of this theorem to § 8. The plane 
that is perpendicular to the base of the oblique cone, and that con- 
tains the vertex O and the center of the base of the cone, is a plane 
of symmetry of the oblique cone. Fig. 109 represents the intersection 
of the cone with this plane. Only the diameter $K_1K_2$ of 
the circular base appears in this figure. The plane of the base is 
perpendicular to the plane of the figure. Because of the position 
of the plane of the figure, the line $OK_1$ is the shortest of all the rays 
from O to the circumference of the base, while $OK_2$ is the longest 
of all these rays. Every plane parallel to the base plane E obviously 
intersects the cone in a circle. If the plane E’ cuts the lines $OK_1$ 
and $OK_2$ at $K_1'$ and $K_2'$ in such a way that $\angle OK_1'K_2' = \angle OK_2K_1$, 
then the theorem that we shall prove in § 8 asserts that E’ intersects 
the cone in a circle. Because we have taken $\angle OK_1'K_2' = \angle OK_2K_1$, 
we also have $\angle OK_2'K_1' = \angle OK_1K_2$, since the sum of the angles of 
a triangle is 180°. Clearly any plane parallel to E’ will also cut the 
cone in a circle. 

With E’ chosen as we have just described, the projection from E 
to E’ has all the properties that we require. We have only to verify 
that the midpoint M of $K_1K_2$ is not mapped on the midpoint M’ 
of $K_1'K_2'$. In the first place, the bisector of the angle at O is the same 
line, whether we think of the angle as belonging to the triangle 
$K_1OK_2$ or to the triangle $K_1'OK_2'$. Now in any triangle the bisector 
of an angle divides the opposite side into segments proportional to 
the adjacent sides of the triangle. In the oblique cone we have 
$OK_2 > OK_1$, and therefore $K_2U > UK_1$ if U is the intersection of E 
and the bisector of the angle at O. Furthermore we have 
$\angle OK_1K_2 > \angle OK_2K_1$, since the larger angle is opposite the larger 
side in a triangle. Then we have $\angle OK_2'K_1' > \angle OK_1'K_2'$ and hence 
$OK_1' > OK_2'$, from which we get $U'K_1' > K_2'U'$, where U’ is the 
intersection of E’ and the same bisector. Since M is the 
midpoint of $K_1K_2$ and M’ is the midpoint of $K_1'K_2'$, we now see that 
M and M’ lie on opposite sides of the bisector of the angle at O. 
Being in this position, M’ cannot be the image of M under our 
projection from O. 
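
The conclusion of this section can be checked numerically in the plane of Fig. 109. The following Python sketch is an illustration added here, not part of the original argument. It places O at the origin, takes two rays with assumed lengths OK1 = 1 and OK2 = 2, and cuts them with a second line chosen so that the triangle O K1' K2' is similar to the triangle O K2 K1, which gives the angle condition required above. The function names, the sample lengths, and the scale factor s are all assumptions made for the example.

```python
import math

def angle_at(vertex, a, b):
    """Unsigned angle a-vertex-b in radians."""
    ax, ay = a[0] - vertex[0], a[1] - vertex[1]
    bx, by = b[0] - vertex[0], b[1] - vertex[1]
    return math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by)

def antiparallel_section(r1, r2, theta1, theta2, s=0.5):
    """Cross-section of the cone: O at the origin, K1 and K2 on two rays.

    K1' and K2' are placed so that OK1' = s*OK2 and OK2' = s*OK1, which makes
    triangle O K1' K2' similar to triangle O K2 K1 (assumed sample construction).
    """
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    K1 = (r1 * d1[0], r1 * d1[1])
    K2 = (r2 * d2[0], r2 * d2[1])
    K1p = (s * r2 * d1[0], s * r2 * d1[1])
    K2p = (s * r1 * d2[0], s * r1 * d2[1])
    return (0.0, 0.0), K1, K2, K1p, K2p

def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

O, K1, K2, K1p, K2p = antiparallel_section(1.0, 2.0, math.radians(60), math.radians(100))

# The angle condition of the text: angle O K1' K2' equals angle O K2 K1.
angle_gap = abs(angle_at(K1p, O, K2p) - angle_at(K2, O, K1))

# M' is the image of M only if O, M, M' are collinear, i.e. this cross product is 0.
M, Mp = midpoint(K1, K2), midpoint(K1p, K2p)
cross = M[0] * Mp[1] - M[1] * Mp[0]

print("angle condition error:", angle_gap)
print("cross product of OM and OM':", cross)
```

A cross product different from zero means that the ray from O through M misses M’, which is the statement proved above.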

5. This completes our proof that the unknown center of a given 
circle cannot be found by means of the straightedge alone. Using 
Fig. 109, we can summarize the main points of the proof. A 
figure in the plane E, consisting of the circle $K_1K_2$ and certain lines, 
is projected from O onto the plane E’. Under this projection, lines 
go into lines and the circle $K_1K_2$ goes into the circle $K_1'K_2'$, but the 
center M of $K_1K_2$ does not go into the center M’ of $K_1'K_2'$. Any 
construction based on lines that determine the center in E will not 
do so in E’. Therefore the construction under consideration is 
impossible. 

6. The mapping is harder to obtain in the case of two given 
circles. Projection from a point O forms two oblique cones, and 
we must arrange it so that the projection plane E’ cuts them both 
in circles. 

We distinguish two cases. First, one of the circles may lie inside 
the other. We draw a figure (Fig. 110) whose plane is perpendicular 
to E and passes through the centers M and N of both circles. The 
circles are represented by their diameters $K_1K_2$ and $L_1L_2$ in the 
figure. Now if we can place the point O so that the angles $K_1OK_2$ 
and $L_1OL_2$ have the same bisector, then we can draw E’ in such a 
way that $\angle L_1WO = \angle L_2'W'O$, using the notation of Fig. 110. 
Then all the angles designated by the same number or letter in the 
figure are easily seen to be equal. Therefore, according to the 
theorem of § 8, the plane E’ cuts both cones in circles $K_1'K_2'$ and 
$L_1'L_2'$. 

The remainder of the proof is the same as before: the projection 
carries the two circles into two circles and it carries lines into lines, 
but the centers of the circles are not carried into the centers of 
their projections. Any straight line construction in E determining the 
centers will fail in E’, and in general no such construction is possible. 


Fig. 110 


We must still show that we can choose O in such a way that the 
angles $K_1OK_2$ and $L_1OL_2$ have the same bisector. For this to be 
true, we must have $\angle K_1OW = \angle K_2OW$ and $\angle L_2OW = \angle L_1OW$. 
Adding these, we see that we must have $\angle K_1OL_2 = \angle K_2OL_1$. 
We choose any arbitrary value $\delta$ for these angles and construct the 
isosceles triangles $K_1C_1L_2$ and $L_1C_2K_2$ with base angles $90^\circ - \delta$ 
(Fig. 111). Using $C_1$ as center, we draw a circular arc through $K_1$ 
and $L_2$. Using $C_2$ as center, we draw a circular arc through $L_1$ 


Fig. 111 
and $K_2$. These arcs will intersect because their chords $K_1L_2$ and 
$L_1K_2$ overlap. We choose O as their intersection. Now we have 

    $\angle K_1OL_2 = \tfrac{1}{2} \angle K_1C_1L_2 = \tfrac{1}{2} [180^\circ - 2(90^\circ - \delta)] = \delta$, 
    $\angle L_1OK_2 = \tfrac{1}{2} \angle L_1C_2K_2 = \tfrac{1}{2} [180^\circ - 2(90^\circ - \delta)] = \delta$. 

Consequently we have $\angle K_1OL_2 = \angle L_1OK_2$, from which we find 
$\angle K_1OL_1 = \angle K_2OL_2$. Therefore the bisector of $\angle L_1OL_2$ is also 
a bisector of $\angle K_1OK_2$, as was required. 
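
The construction of O by two circular arcs can be tried out with numbers. The sketch below is an added illustration under assumed data: the four points are placed on the x-axis at K1 = 0, L1 = 1, L2 = 2, K2 = 4 (so one diameter lies inside the other), and the common angle is taken as 50°. The helper names are hypothetical. The circle over a chord of length c that sees the chord under the inscribed angle δ from above has radius c / (2 sin δ) and its center at height c / (2 tan δ) over the midpoint of the chord.

```python
import math

def angle(p, vertex, q):
    """Unsigned angle p-vertex-q in radians."""
    ax, ay = p[0] - vertex[0], p[1] - vertex[1]
    bx, by = q[0] - vertex[0], q[1] - vertex[1]
    return math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by)

def arc_circle(a, b, inscribed):
    """Circle through (a, 0) and (b, 0) whose arc above the axis sees the chord under `inscribed`."""
    chord = b - a
    center = ((a + b) / 2.0, chord / (2.0 * math.tan(inscribed)))
    radius = chord / (2.0 * math.sin(inscribed))
    return center, radius

def upper_intersection(c1, r1, c2, r2):
    """Intersection point of two circles with the larger y-coordinate."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    h = math.sqrt(r1 * r1 - a * a)
    fx, fy = c1[0] + a * dx / d, c1[1] + a * dy / d
    candidates = [(fx - h * dy / d, fy + h * dx / d), (fx + h * dy / d, fy - h * dx / d)]
    return max(candidates, key=lambda p: p[1])

# Assumed sample positions: diameter L1L2 lies inside diameter K1K2.
K1, L1, L2, K2 = (0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)
delta = math.radians(50)

# One arc over the chord K1L2, one over the chord L1K2, both for the angle delta.
O = upper_intersection(*arc_circle(K1[0], L2[0], delta), *arc_circle(L1[0], K2[0], delta))

print("angle K1 O L2 in degrees:", math.degrees(angle(K1, O, L2)))
print("angle L1 O K2 in degrees:", math.degrees(angle(L1, O, K2)))
# Equal outer angles mean that the two bisectors at O coincide.
print("angle K1 O L1:", angle(K1, O, L1), " angle L2 O K2:", angle(L2, O, K2))
```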

7. In the second of our two cases the two given circles lie outside 
of each other. Here the two cones will be outside of each other, 
and it will be impossible for the angles $K_1OK_2$ and $L_1OL_2$ to have 
the same bisector for any position of O. 

In this case we will have to use the other nappe of the cone (Fig. 
112). The theorem of § 8 will apply to both cones if $\angle L_1VO = \angle L_2'V'O$ 
and $\angle K_1UO = \angle K_2'UO$. Then the triangle UVV’ will 


Fig. 112 


be an isosceles triangle, and UO will bisect its vertex. Therefore 
UO is perpendicular to VV’, and we have 

    $\angle K_1OL_1 = 90^\circ + \alpha - \beta$, 

    $\angle K_2OL_2 = 90^\circ - \alpha + \beta$, 

and consequently 

(1)    $\angle K_1OL_1 + \angle K_2OL_2 = 180^\circ$. 

Therefore the point O must be chosen to make (1) true. Also, if 
(1) is true, then the bisectors of the angles $K_1OK_2$ and $L_1OL_2$ will 
actually be perpendicular, for we have 

    $2 \angle UOV = 2 \angle UOK_2 + 2 \angle K_2OL_1 + 2 \angle L_1OV$ 
    $= \angle K_1OU + \angle UOK_2 + \angle K_2OL_1 + \angle K_2OL_1 + \angle L_1OV + \angle VOL_2$ 
    $= \angle K_1OL_1 + \angle K_2OL_2 = 180^\circ$, 

and therefore 

    $\angle UOV = 90^\circ$. 

If this is the case we can place E’, as shown in the figure, in such a 
position that the angles 2 are equal, and then the angles 1 will also 
be equal. Then E’ cuts both cones in circles, the projection has the 
required properties, and the remainder of the proof is the same as 
before. 

In order to find the position of the point O satisfying (1), we 
choose arbitrary values $\angle K_1OL_1 = \varphi$ and $\angle K_2OL_2 = \psi$, subject 
to the restriction $\varphi + \psi = 180^\circ$. Then, following the method used 
at the end of § 6, we draw a circular arc in which the angle $\varphi$ can 
be inscribed over the chord $K_1L_1$. Similarly, we draw an arc in 
which $\psi$ can be inscribed over $K_2L_2$. Since $K_1L_1$ and $K_2L_2$ overlap, 
the circular arcs will intersect. Their intersection is the point O 
that we are looking for. 
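
The same two-arc construction can be checked numerically for circles lying outside of each other. The sketch below is an added illustration with assumed data: the diameters are placed on the x-axis as K1 = 0, K2 = 2, L1 = 3, L2 = 4, and the two angles are taken as 100° and 80°, which add up to 180°. The helper names are hypothetical. The check at the end confirms that the bisectors of the angles K1OK2 and L1OL2 come out perpendicular.

```python
import math

def arc_circle(a, b, inscribed):
    """Circle through (a, 0) and (b, 0) whose arc above the axis sees the chord under `inscribed`."""
    chord = b - a
    center = ((a + b) / 2.0, chord / (2.0 * math.tan(inscribed)))
    radius = chord / (2.0 * math.sin(inscribed))
    return center, radius

def upper_intersection(c1, r1, c2, r2):
    """Intersection point of two circles with the larger y-coordinate."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    h = math.sqrt(r1 * r1 - a * a)
    fx, fy = c1[0] + a * dx / d, c1[1] + a * dy / d
    candidates = [(fx - h * dy / d, fy + h * dx / d), (fx + h * dy / d, fy - h * dx / d)]
    return max(candidates, key=lambda p: p[1])

def unit(origin, p):
    """Unit vector pointing from origin to p."""
    x, y = p[0] - origin[0], p[1] - origin[1]
    n = math.hypot(x, y)
    return (x / n, y / n)

def bisector(origin, p, q):
    """Direction of the bisector of the angle p-origin-q (not normalized)."""
    u, v = unit(origin, p), unit(origin, q)
    return (u[0] + v[0], u[1] + v[1])

# Assumed sample positions: the two diameters do not overlap.
K1, K2, L1, L2 = (0.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)
phi, psi = math.radians(100), math.radians(80)   # phi + psi = 180 degrees

# Arc for phi over the chord K1L1, arc for psi over the chord K2L2.
O = upper_intersection(*arc_circle(K1[0], L1[0], phi), *arc_circle(K2[0], L2[0], psi))

bk = bisector(O, K1, K2)
bl = bisector(O, L1, L2)
dot = bk[0] * bl[0] + bk[1] * bl[1]
print("O =", O)
print("dot product of the two bisector directions:", dot)
```

A dot product of zero (up to rounding) corresponds to the right angle UOV derived above.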

We have now proved that two non-intersecting circles without 
centers are not sufficient to make all elementary constructions 
possible with a straightedge alone. 

8. There still remains the proof of the theorem postponed from 
§ 4. This theorem asserts that if the plane E (Fig. 109) intersects 
the oblique cone $K_1OK_2$ in a circle, then the plane E’ will also intersect the 
cone in a circle if the angles $K_1'K_2'O$ and $K_2K_1O$ are equal. It must be 
remembered that the plane of the figure is the plane perpendicular 
to the base of the cone that is determined by the vertex O and the 
center M of the base. The condition $\angle K_1'K_2'O = \angle K_2K_1O$ can 
clearly be replaced by $\angle K_2'K_1'O = \angle K_1K_2O$ or by $\angle K_2'U'O = 
\angle K_1UO$. 

The plane of Fig. 109 is a plane of symmetry of the cone. Our 
proof will depend on finding another plane of symmetry. 

Fig. 113 shows the same cross section of the cone as is depicted 
in Fig. 109. We have drawn the circle circumscribing the triangle 
$OK_1K_2$ as well as the bisector MO of the angle at O. Since the 
angles $K_1OM$ and $K_2OM$ are equal, the arcs $K_1M$ and $K_2M$ which 
they intercept are equal. Therefore the perpendicular from the 
center C of the circle to the chord $K_1K_2$ intersects the circle at the 
point M. If we rotate the circle about the axis M’CM, it describes 
a sphere. The circumference of the base $K_1K_2$ and the vertex O 
of the cone lie on this sphere; the cone is inscribed in the sphere. 
Therefore the point M is equally distant from all points on the 
circle $K_1K_2$. 

The line OM plays a special role in the study of oblique cones. 
It is called the axis of the cone. If a plane is passed through the axis, 
it cuts the cone in a triangle with one vertex at O and the other 
two, $H_1$ and $H_2$, on the base of the cone. This plane cuts the 
sphere in a circle that circumscribes the triangle $OH_1H_2$. The 




Fig. 113 Fig. 114 


point M also lies on this circle, and we obtain a figure (Fig. 114) 
that is completely analogous to Fig. 113. Since the two chords 
$H_1M$ and $H_2M$ are equal because of the relation of M to the circular 
base of the cone, the axis OM is the bisector of the angle $H_1OH_2$. 
The axis of an oblique cone has the property that every plane through it cuts 
the cone in a triangle in which the axis is a bisector of an angle. 

Now we cut the cone with a plane A perpendicular to the axis 
of the cone and to the plane of Fig. 113. The intersection of A 
and the plane of the figure is shown as the dotted line AA in Fig. 113. 
The intersection of A and the cone is represented in Fig. 115. The 
plane of this figure is the plane A, and the axis of the cone is perpendicular 
to it. The plane of Fig. 113 cuts the plane of Fig. 115 in 


Fig. 115 


the line XX. We consider an arbitrary plane through the axis. 
It cuts the plane of Fig. 115 in some line, say BB. The points $T_1$, 
$T_2$ represent its intersection with the curve. The vertex O is not 
in the plane of Fig. 115 but it stands directly above the point U. 
In the triangle $T_1OT_2$, the axis UO is perpendicular to $T_1T_2$ and, 
by our above result, it is the bisector of angle $T_1OT_2$. Therefore 
we have $T_1U = T_2U$. Since B was any plane through the axis, 
this means that the point U is a center of symmetry of the intersection 
of the cone and the plane A. 

The plane of Fig. 113 is a plane of symmetry of the cone, and its 
intersection with the plane of Fig. 115 is XX. Therefore Fig. 115 
is symmetrical with respect to both the line XX and the point U. 
In other words, the mirror images $T_1^*$ and $T_2^*$ on the cone correspond 
to $T_1$ and $T_2$ on the cone. 

The four points form a quadrilateral $T_1T_1^*T_2T_2^*$. Since $T_1U$ and 
$T_2U$ are equal, we have $T_1^*U = T_2^*U$ in the mirror image. Furthermore, 
$\angle T_1UT_2^* = \angle T_2UT_1^*$, since they are vertical angles. Then 
the triangles $T_1T_2^*U$ and $T_1^*T_2U$ are congruent, so we have $T_1T_2^* = 
T_2T_1^*$. The lines $T_1T_1^*$ and $T_2T_2^*$ are also parallel, since they are 
both perpendicular to XX. Therefore $T_1T_1^*T_2T_2^*$ is a parallelogram. 
Since the two diagonals $T_1T_2$ and $T_1^*T_2^*$ are equal, the parallelogram 
is a rectangle and U is its center. This shows that Fig. 115 also 
has the axis of symmetry YY. 

Our result shows that the plane through the axis, that is per- 
pendicular to the planes of Figs. 113 and 115, is a second plane of 
symmetry of the cone. It is only necessary to notice that this plane 
contains the line YY which is shown in Fig. 115 and which passes 
through U perpendicular to the plane of Fig. 113. 

If we reflect the cone in this second plane of symmetry, its image 
coincides with the cone itself; this is just another way of stating the 
symmetry. However, the image of the circle $K_1K_2$ is another circle 


Fig. 116 


$K_1'K_2'$, which must also lie on the cone (Fig. 116). This is the circle 
we have been looking for. The angle between the plane of this 
circle and the axis of the cone is in agreement with the statement 
of the theorem. Finally, we note that the circle $K_1K_2$ cannot 
coincide with its image, since this could happen only if its plane 
were perpendicular to the axis of the cone. But this is in contra- 
diction to the assumption that the cone is oblique. 


27. A Property of the Number 30 


Neither 10 nor 21 is a prime number. But 10 = 2·5 and 21 = 3·7 
have no divisor that is common to them both. For this reason they 
are called “relatively prime” numbers. The numbers 6 and 10 
are not relatively prime; they have the common divisor 2. 

Of all the numbers from 1 to 9, the numbers 3, 7, 9 are relatively 
prime to 10. Although 9 is relatively prime to 10, it is not a prime 
number. In the case of 12 the situation is different. Of the num- 
bers from 1 to 11, only 5, 7, 11 are relatively prime to 12, and 
they are all prime numbers. It can easily be seen that this property 
of 12 is shared by the numbers 


    3, 4, 6, 8, 12, 18, 24, 30. 


Is 30 the largest number that has this property, that all the numbers less 
than it and relatively prime to it are prime numbers? This chapter will 
be devoted to showing that this is so. 
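
The list above can be reproduced by direct search. The following Python sketch is an added illustration, not part of the original text. It tests every number from 3 up to an assumed bound of 2000 and, following the usage of this chapter, leaves the number 1 out of consideration when looking at the numbers relatively prime to N. A finite search of this kind cannot replace the proof that follows; it only shows what the proof has to establish.

```python
from math import gcd

def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def has_property(n):
    """True if every k with 1 < k < n that is relatively prime to n is a prime."""
    return all(is_prime(k) for k in range(2, n) if gcd(k, n) == 1)

BOUND = 2000  # assumed search limit for the illustration
found = [n for n in range(3, BOUND + 1) if has_property(n)]
print(found)
```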

We start by seeing how we might set about looking for numbers 
with this property. From 4 on, every such number N must be 
divisible by 2. For if it were odd, 4 would be relatively prime to 
it, while 4 is not a prime number. In the same way, every such 
N from 9 on must be divisible by 3. Since it is already divisible 
by 2, it must be divisible by 2·3. Continuing this argument, we 
obtain the table: 


From 4 on,   N must be divisible by 2 = 2 
From 9 on,   N must be divisible by 2·3 = 6 
From 25 on,  N must be divisible by 2·3·5 = 30 
From 49 on,  N must be divisible by 2·3·5·7 = 210 
From 121 on, N must be divisible by 2·3·5·7·11 = 2310. 


Between 4 and 9, the only possible values for N are 4, 6, 8; between 
9 and 25, only 12, 18, 24; between 25 and 49, only 30 (60, the 
next multiple of 30, is larger than 49). Between 49 and 121 there 
are no possibilities, since 210 is already larger than 121. Now we 
see that if the table continues in this way, that is if $13^2$ is less than 
2·3·5·7·11, $17^2$ is less than 2·3·5·7·11·13, and so on, then 
30 is the largest number having the property we require. If we 
represent the successive prime numbers 


    2, 3, 5, 7, 11, 13, 17, ··· 


by the symbols 

    $p_1, p_2, p_3, p_4, p_5, p_6, p_7, \cdots$, 


as we did in Chapter 1, then we must show that 


(1)    $p_{n+1}^2 < p_1 p_2 \cdots p_n$ 

is true for all n from 4 on. 

Euclid’s proof, which is reproduced in Chapter 1, shows that we 
have 


    $p_{n+1} < p_1 p_2 \cdots p_n$. 

What we need is 


    $p_{n+1} < \sqrt{p_1 p_2 \cdots p_n}$, 


and this asserts more than Euclid’s result. The inequality (1) is 
probably of as much interest as the original problem concerning 
the number 30. Since the original problem is solved as soon as 
we have (1), we shall concentrate on the proof of (1). 
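
Before turning to the proof, the inequality (1) can be tried out on the first few hundred primes. The Python sketch below is an added illustration; the function names and the bound of 200 primes are assumptions made for the example. It lists the values of n for which the square of the (n+1)-st prime is not smaller than the product of the first n primes.

```python
def first_primes(count):
    """Return the first `count` primes by trial division with earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def inequality_1_holds(n, primes):
    """Check p_{n+1}^2 < p_1 p_2 ... p_n, where primes[0] is p_1."""
    product = 1
    for p in primes[:n]:
        product *= p
    return primes[n] ** 2 < product  # primes[n] is p_{n+1}

primes = first_primes(201)
failures = [n for n in range(1, 201) if not inequality_1_holds(n, primes)]
print(failures)
```

Within the tested range the inequality fails only for n = 1, 2, 3, in agreement with the statement that (1) holds from n = 4 on.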

The inequality (1) asserts far less than is actually true. Not only 
is the next prime number after $p_5 = 11$ less than 
$\sqrt{2 \cdot 3 \cdot 5 \cdot 7 \cdot 11} = \sqrt{2310} = 48.06 \cdots$, it is only 13. The discrepancy becomes even 
greater as we go on. However, because of the tremendous irregularity 
of the primes, it is very difficult to obtain results that are 
valid for all primes. With the aid of extensive methods, Tschebyscheff 
proved that the next prime after $p_n$ is less than $2p_n$, that is, 
$p_{n+1} < 2p_n$. This is much more than we need here, so the question 
arises whether our lesser assertion (1) cannot be proved, through 
the use of elementary methods. 

H. Bonse, as a student, discovered an ingenious proof of (1). It 
not only avoids all the analytical methods and infinite processes 
that were used by Tschebyscheff, but it uses only the very simplest 
mathematical ideas. 

1. The basic idea of the proof is similar to that of Euclid’s proof, 
as given in Chapter 1. Instead of forming the expression 


    $N = p_1 p_2 \cdots p_n + 1$   or   $M = p_1 p_2 \cdots p_n - 1$ 

out of the first n prime numbers, we use only the first i prime numbers 
$p_1, \cdots, p_i$ and form the $p_i$ expressions 

    $M_1 = p_1 p_2 \cdots p_{i-1} \cdot 1 - 1$, 
    $M_2 = p_1 p_2 \cdots p_{i-1} \cdot 2 - 1$, 
    $M_3 = p_1 p_2 \cdots p_{i-1} \cdot 3 - 1$, 
    $M_4 = p_1 p_2 \cdots p_{i-1} \cdot 4 - 1$, 
    ··· 
    $M_{p_i} = p_1 p_2 \cdots p_{i-1} \cdot p_i - 1$. 


As in the case of Euclid’s expression, we can assert: 

(a) None of the expressions $M_1, \cdots, M_{p_i}$ is divisible by any of the 
prime numbers $p_1, p_2, \cdots, p_{i-1}$. 

(b) At most one of these expressions is divisible by $p_i$. For if two of 
them, say $p_1 \cdots p_{i-1} x - 1$ and $p_1 \cdots p_{i-1} y - 1$, were divisible by 
$p_i$, their difference $p_1 \cdots p_{i-1}(x - y)$ would also be divisible by $p_i$. 
Since $p_i$ does not divide any of the first i − 1 factors, it would have 
to divide (x − y). But x and y are among the numbers 1, 2, 3, 4, 5, ···, 
$p_i$, so their difference is at most $p_i - 1$, which is less than $p_i$. Therefore 
$p_i$ cannot divide this difference, since a larger number cannot 
divide a smaller one. This same proof shows that we can also 
assert: 

(c) At most one of the expressions is divisible by $p_{i+1}$, at most one by 
$p_{i+2}$, ···, at most one by $p_n$. 

Now if there are fewer of the numbers $p_i, p_{i+1}, \cdots, p_n$ than there 
are of the expressions $M_1, \cdots, M_{p_i}$, in other words, if we have 


(2)    $n - i + 1 < p_i$, 

then at least one of the expressions M is not divisible by any of the 
numbers $p_i, p_{i+1}, \cdots, p_n$. This important step in the proof follows 
directly from (b) and (c). If we call this particular expression 
$M_h$, then $M_h$ is not divisible by any of the prime numbers $p_1, \cdots, p_n$, 
since (a) shows that it is not divisible by $p_1, \cdots, p_{i-1}$. 

The next step follows Euclid’s proof. Either $M_h$ is a prime 
number or it can be factored into prime factors. There is a prime 
number p which is either equal to $M_h$ or divides $M_h$. Since $M_h$ 
is not divisible by any of $p_1, \cdots, p_n$, this prime number p must be 
beyond $p_n$. The next prime number after $p_n$ is $p_{n+1}$, so we have 
$p_{n+1} \le p$. Also, since p divides or is equal to $M_h$, we have $p \le M_h$. 
The largest of all the expressions M is $M_{p_i}$, so we can put these 
inequalities together to obtain 


    $p_{n+1} \le p \le M_h \le M_{p_i} = p_1 p_2 \cdots p_i - 1 < p_1 p_2 \cdots p_i$. 

We can summarize our results in the statement: If (2) holds, then 
we have 

(3)    $p_{n+1} < p_1 \cdots p_i$. 

2. The result of § 1 is an improvement of Euclid’s result, 

    $p_{n+1} < p_1 \cdots p_n$. 


The number i is less than n, so the right side has been decreased. 
How much have we actually gained? The condition (2) does not 
allow us to choose an arbitrarily small value for i. We must keep 
i large enough so that the number of numbers $p_i, \cdots, p_n$, that is, 
n − i + 1, is less than $p_i$, the first of these numbers. This requirement 
is complicated by the interplay of i with itself. A simple 
example will help us to see how it behaves. 

Let us take n = 5, so that we are considering the 5 prime numbers 
2, 3, 5, 7, 11. If we choose $p_i = 3$, we have i = 2, and the group 
of numbers $p_i, \cdots, p_n$ consists of 3, 5, 7, 11. This group consists 
of n − i + 1 = 5 − 2 + 1 = 4 numbers. The number of these 
numbers is not less than the first of them (4 is not less than $p_i = 3$), 
so we have chosen too small a value for i. If we increase i by 1, 
to i = 3, we have $p_i = 5$, and there are only 3 numbers 5, 7, 11. 
Since 3 is less than 5, this is a suitable choice for i. Obviously 
any larger choice of i would also satisfy (2). 
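
For the example n = 5, i = 3 the expressions M of § 1 can be written out in full. The Python sketch below is an added illustration with hypothetical variable names. It forms the five numbers 2·3·k − 1 for k = 1, ···, 5 and picks out those that are divisible by none of the primes 2, 3, 5, 7, 11.

```python
primes = [2, 3, 5, 7, 11]          # p_1, ..., p_5
n, i = 5, 3                        # the choice discussed in the text

base = 1
for p in primes[:i - 1]:           # product p_1 ... p_{i-1} = 2 * 3
    base *= p
p_i = primes[i - 1]                # p_3 = 5

# The p_i expressions M_1, ..., M_{p_i}.
M = [base * k - 1 for k in range(1, p_i + 1)]

# Expressions not divisible by any of p_1, ..., p_n.
survivors = [m for m in M if all(m % p for p in primes[:n])]

print("M =", M)
print("not divisible by 2, 3, 5, 7, 11:", survivors)
```

Here 5 is removed by the prime 5 and 11 by the prime 11, and three expressions remain. Each of them has all its prime factors beyond 11 and is less than 2·3·5 = 30, which is the content of (3) for this case.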

The result that we shall prove in this section is that if i is chosen 
as small as possible, satisfying the condition (2), then we will have 


(4)    $p_1 \cdots p_i < p_{i+1} \cdots p_n$. 

This is true for n = 5, as can be verified by multiplication, 
2·3·5 < 7·11. In order to see that it holds for further n, we 
must see how the optimal i changes with increasing n. 

For n = 5 we had i = 3, $p_i = 5$. The group of numbers 
$p_i, \cdots, p_n$ contained 3 numbers, 5, 7, 11. If we change from 
n = 5 to n = 6 we bring in another prime number 13. However, 
we need not change i, since the group of numbers $p_i, \cdots, p_n$ can 
contain 4 numbers. If we change to n = 7, we bring in the prime 
number 17. We must now increase i, since otherwise the group 
$p_i, \cdots, p_n$ will contain 5 numbers and 5 would not be less than 
$p_i = 5$. Consequently we must choose i = 4 if n = 7. Now i = 4 
implies $p_i = 7$, and this will permit a group of 6 numbers, 7, 11, 
13, 17, 19, 23. That is, i = 4 will suffice for n = 7, 8, 9. For 
n = 10, the value of i must be increased to i = 5. Then $p_i$ will 
be 11. Since this represents a jump of 4 over the previous $p_i = 7$, 
this value of i will suffice for 5 values of n, for n = 10, 11, 12, 13, 14. 
The value of $p_i$ is shown in the following table by enclosing it in 
brackets: 

n = 5:   2, 3, [5], 7, 11 
n = 6:   2, 3, [5], 7, 11, 13 
n = 7:   2, 3, 5, [7], 11, 13, 17 
n = 8:   2, 3, 5, [7], 11, 13, 17, 19 
n = 9:   2, 3, 5, [7], 11, 13, 17, 19, 23 
n = 10:  2, 3, 5, 7, [11], 13, 17, 19, 23, 29 
n = 11:  2, 3, 5, 7, [11], 13, 17, 19, 23, 29, 31 
n = 12:  2, 3, 5, 7, [11], 13, 17, 19, 23, 29, 31, 37 
n = 13:  2, 3, 5, 7, [11], 13, 17, 19, 23, 29, 31, 37, 41 
n = 14:  2, 3, 5, 7, [11], 13, 17, 19, 23, 29, 31, 37, 41, 43 
n = 15:  2, 3, 5, 7, 11, [13], 17, 19, 23, 29, 31, 37, 41, 43, 47 
··· 



Each time that i increases by 1, the prime number $p_i$ increases 
by at least 2, and this allows i to remain the same for the next 3 
values of n. If it happens that $p_i$ increases by more than 2, the 
value of i remains constant for more values of n. 
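
The way the smallest admissible i moves with n is easy to tabulate. The Python sketch below is an added illustration with hypothetical function names. For each n it returns the smallest i that satisfies condition (2), n − i + 1 < p_i.

```python
def first_primes(count):
    """Return the first `count` primes by trial division with earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def smallest_i(n, primes):
    """Smallest i with n - i + 1 < p_i, where primes[i - 1] is p_i."""
    for i in range(1, n + 1):
        if n - i + 1 < primes[i - 1]:
            return i
    return None

primes = first_primes(15)
table = {n: smallest_i(n, primes) for n in range(5, 16)}
for n, i in table.items():
    print(f"n = {n:2d}: i = {i}, p_i = {primes[i - 1]}")
```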

Now that we have seen how the optimal i changes with increasing 
n, we can discuss the validity of (4). We already know that (4) 
is true for n = 5, 


(5)    2·3·5 < 7·11. 

Going to n = 6, i does not change, and (4) asserts that 

(6)    2·3·5 < 7·11·13, 


and this is clearly true, since the right side of (5) has been increased. 

Going from n = 6 to n = 7 is not quite so simple because i changes 
its value here. The number 7 moves from the right side to the left, 
and the new prime 17 appears on the right. What we wish to 
prove is that 

(7)    2·3·5·7 < 11·13·17. 

Without actually making the computation, we cannot obtain (7) 
directly from (6). We could multiply the left side of (6) by the 
factor 7 and the right side by the larger factor 17, but this would 
yield 

    2·3·5·7 < 7·11·13·17, 

in which the unwanted factor 7 appears on the right. 

If we start with (5), however, we can verify (7) without computation. 
Multiplying the left side of (5) by 7·7, the right by 
13·17, we obtain 

    2·3·5·7·7 < 7·11·13·17. 

If a factor 7 is canceled out we have the inequality (7). 

This argument carries us from n = 6 to n = 7 without any actual 
computations. It depends on the fact that we can go back two 
steps to n = 5. Exactly the same argument can be carried out for 
every similar step from one value of i to the next. It depends only 
on the fact that whenever the value of i increases, it then remains 
the same for at least two consecutive values of n (we have seen that 
it remains the same for at least three and often more values of n). 
Therefore the inequality (4) is true for all n = 5, 6, ···. 
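
The inequality (4) can also be confirmed by plain multiplication for a range of n. The Python sketch below is an added illustration; the function names and the upper limit n = 60 are assumptions made for the example. For each n it takes the smallest i allowed by (2) and compares the two products.

```python
from math import prod

def first_primes(count):
    """Return the first `count` primes by trial division with earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes

def smallest_i(n, primes):
    """Smallest i with n - i + 1 < p_i, where primes[i - 1] is p_i."""
    for i in range(1, n + 1):
        if n - i + 1 < primes[i - 1]:
            return i
    return None

def inequality_4_holds(n, primes):
    """Check p_1 ... p_i < p_{i+1} ... p_n for the smallest admissible i."""
    i = smallest_i(n, primes)
    return prod(primes[:i]) < prod(primes[i:n])

primes = first_primes(60)
confirmed = [n for n in range(5, 61) if inequality_4_holds(n, primes)]
print(confirmed == list(range(5, 61)))
print("n = 4:", inequality_4_holds(4, primes))
```

For n = 4 the smallest admissible i is 3, and 2·3·5 is not less than 7, so (4) begins to hold only at n = 5.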

3. If we multiply both sides of (4) by $p_1 \cdots p_i$, we obtain 


    $(p_1 \cdots p_i)^2 < p_1 \cdots p_n$ 


or 


(8)    $p_1 \cdots p_i < \sqrt{p_1 \cdots p_n}$. 

This, combined with (3), gives the desired result (1) for n = 5, 6, ···. 

That (1) is also true for n = 4 is easily verified by multiplying it out. 

4. Bonse carried the discussion a little further. The inequality 


(9)    $p_{n+1} < \sqrt[3]{p_1 \cdots p_n}$ 

for n ≥ 5, can be proved by the same methods. The decisive point 
in the proof is the fact that the values of i, discussed at the end of 
§ 2, actually remain the same for 3 values of n, and not merely for 
2 steps, as in the proof just completed. 


28. An Improved Inequality 


In Chapter 27 we mentioned that Bonse’s proof of (8) actually 
gives the better inequality (9). As a matter of fact, the addition 
of one simple idea will allow us to prove even a little more in one 
direction, although, as we shall see, we will lose something in another 
direction. 

The new idea is this: If M is a number of the form 6m — 1 (a 
multiple of 6 decreased by 1), then in the decomposition of M into 
prime factors there must appear at least one prime which is also 
of the same form, 6x — 1. To see that this is true, we notice that 
every number is of one of the forms, 6x, 6x — 1, 6x — 2, 6x — 3, 
6x — 4, or 6x — 5. Now 6x, 6x — 2 and 6x — 4 are all even 
numbers, so 2 is the only possible prime among them. Also, 6x — 3 
is divisible by 3, so 3 is the only prime of this form. All that remain 
are 6x − 1 and 6x − 5, so every prime number is either one of these 
two forms or is 2 or 3. However, 2 and 3 cannot divide M, which 
is of the form 6m − 1. Furthermore, the product of two numbers 
6y − 5 and 6z − 5 is 

    (6y − 5)(6z − 5) = 36yz − 30y − 30z + 25 = 6(6yz − 5y − 5z + 5) − 5, 

which is of the same form again, so, in order for M to have the 
form 6m − 1, its decomposition must contain at least one prime 
factor of the form 6x − 1. 
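
The new idea can be tested on a stretch of numbers. The Python sketch below is an added illustration with hypothetical function names. It factors each number 6m − 1 for m from 1 to an assumed limit of 2000 and checks that at least one prime factor leaves the remainder 5 on division by 6, which is the same as being of the form 6x − 1.

```python
def prime_factors(m):
    """Prime factors of m with multiplicity, found by trial division."""
    factors = []
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.append(d)
            m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors

def has_factor_of_form_6x_minus_1(m):
    """True if some prime factor of m is of the form 6x - 1."""
    return any(p % 6 == 5 for p in prime_factors(m))

LIMIT = 2000  # assumed range for the illustration
always = all(has_factor_of_form_6x_minus_1(6 * m - 1) for m in range(1, LIMIT + 1))
print(always)
print(prime_factors(35))   # 35 = 6*6 - 1
```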

In order to obtain our new inequality, improving over Chapter 27, 
we do not use all the prime numbers $p_1, p_2, p_3, p_4, \cdots = 2, 3, 5, 7, \cdots$ 
as we did before; instead, we take only $q_1, q_2, q_3, q_4, \cdots = 2, 3, 5, 11, \cdots$. 
We take $q_1 = 2$, $q_2 = 3$ and, for the remaining q’s, the prime numbers 
of the form 6x − 1. 

Now we form the expressions $M_1, M_2, \cdots, M_{q_i}$ just as before, 
but we use the q’s instead of p’s: 


    $M_1 = q_1 q_2 \cdots q_{i-1} \cdot 1 - 1$, 
    $M_2 = q_1 q_2 \cdots q_{i-1} \cdot 2 - 1$, 
    $M_3 = q_1 q_2 \cdots q_{i-1} \cdot 3 - 1$, 
    ··· 
    $M_{q_i} = q_1 q_2 \cdots q_{i-1} \cdot q_i - 1$. 


The statements (a), (b), (c) that we made concerning the original 
expressions are still true if we only read q in place of p. 

The next step needs a little explanation. Just as before, if 

(2*)    $n - i + 1 < q_i$ 

then there is some M, call it $M_h$, that is not divisible by any of the 
primes $q_1, \cdots, q_n$. If $M_h$ is a prime itself, it is a prime of the form 
6x − 1 which comes after all the $q_1, \cdots, q_n$. If $M_h$ is not a prime, 
by our first remark there is at least one prime of the form 6x − 1 
that divides it. Since this prime cannot be any of $q_1, \cdots, q_n$, it 
comes after $q_n$. In either case we find that there is a q of the form 
6x − 1 that divides $M_h$ and comes after $q_n$. Since $q_{n+1}$ is the next 
prime of the form 6x − 1 that follows $q_n$, we have $q_{n+1} \le q$ and, 
since q divides $M_h$, we have $q \le M_h < q_1 \cdots q_i$. Combining these 
results, we have 

(3*)    $q_{n+1} < q_1 \cdots q_i$ 

if (2*) holds. 

Now, taking the smallest possible value for i still satisfying 
(2*), we make a table showing the value of $q_i$ (in brackets) for each n, as we did 
in § 2. Since the table becomes rather large, we shall write down 
only certain parts of it: 


n = 6:    2, 3, [5], 11, 17, 23 
n = 7:    2, 3, 5, [11], 17, 23, 29 
n = 8:    2, 3, 5, [11], 17, 23, 29, ··· 
n = 9:    2, 3, 5, [11], 17, 23, 29, ··· 
n = 10:   2, 3, 5, [11], 17, 23, 29, ··· 
n = 11:   2, 3, 5, [11], 17, 23, 29, ··· 
n = 12:   2, 3, 5, [11], 17, 23, 29, ··· 
n = 13:   2, 3, 5, [11], 17, 23, 29, ··· 
n = 14:   2, 3, 5, 11, [17], 23, 29, ··· 
··· 
n = 20:   2, 3, 5, 11, [17], 23, 29, ··· 
··· 
n = 114:  2, 3, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, [101], 107, 113, ··· 
n = 115:  2, 3, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, 101, [107], 113, ··· 
··· 
n = 121:  2, 3, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, 101, [107], 113, ··· 
n = 122:  2, 3, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, 101, 107, [113], ··· 

The first thing to notice about this table is that whenever $q_i$ increases, 
it increases by at least 6. This allows i to remain the same 
for at least the next 7 values of n. Because of the long pause that i 
makes, we will be able to prove 


(4*)    $(q_1 \cdots q_i)^6 < q_{i+1} \cdots q_n$. 

In fact, if this is true for part of the table up to any value of n, and 
if i does not change for the next value n + 1, then (4*) still holds 
for n + 1, since only the right side is increased. If i does change, 
however, we can go back 7 steps to n − 6 without changing i and 
still have 


    $(q_1 \cdots q_i)^6 < q_{i+1} \cdots q_{n-6}$. 

If we multiply the left side by $q_{i+1}^7$ and the right by 

    $q_{n-5}\, q_{n-4}\, q_{n-3}\, q_{n-2}\, q_{n-1}\, q_n\, q_{n+1}$, 

which is larger, we obtain 

    $(q_1 \cdots q_i)^6\, q_{i+1}^7 < q_{i+1} \cdots q_{n+1}$. 

Dividing both sides by $q_{i+1}$ gives us our inequality for n + 1: 

    $(q_1 \cdots q_{i+1})^6 < q_{i+2} \cdots q_{n+1}$. 

We have still not completely proved (4*). We have only seen 
that it is true for the value n + 1, provided either that it is true 
for n and i does not change, or that it is true for n − 6 and i does 
change. We shall see that it is true for n = 114 and 115. Since 
n = 115 is the first of at least 7 steps in which i does not change, 
(4*) will then be proved for all n ≥ 114. It is easy to see that (4*) 
is true for n = 114 and 115. For n = 115 the left side is the 6-th 
power of a product of i = 16 factors; that is, the left side is the product 
of 6·16 = 96 factors. The right side is the product of n − i = 
115 − 16 = 99 factors, and each factor on the right is larger than 
each one on the left. The proof for n = 114 is similar. This 
proof is very crude in that it does not make use of the fact that the 
numbers on the left are much smaller than those on the right. 
By making more careful computations we could probably reduce 
the size of n somewhat, but the computations would be quite lengthy. 
One fact is fairly obvious: (4*) is not true for n = 6. 
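
Since only whole numbers are involved, the statements about n = 114, 115 and n = 6 can be checked exactly. The Python sketch below is an added illustration with hypothetical function names. It builds the special primes q (2, 3, and then the primes of the form 6x − 1), finds the smallest i allowed by (2*), and compares the two sides of (4*) with exact integer arithmetic.

```python
from math import prod

def special_primes(count):
    """q_1 = 2, q_2 = 3, followed by the primes of the form 6x - 1."""
    qs = [2, 3]
    candidate = 5                      # 5, 11, 17, ... are the numbers 6x - 1
    while len(qs) < count:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            qs.append(candidate)
        candidate += 6
    return qs

def smallest_i(n, qs):
    """Smallest i with n - i + 1 < q_i, where qs[i - 1] is q_i."""
    for i in range(1, n + 1):
        if n - i + 1 < qs[i - 1]:
            return i
    return None

def inequality_4_star_holds(n, qs):
    """Check (q_1 ... q_i)^6 < q_{i+1} ... q_n for the smallest admissible i."""
    i = smallest_i(n, qs)
    return prod(qs[:i]) ** 6 < prod(qs[i:n])

qs = special_primes(122)
for n in (6, 114, 115):
    i = smallest_i(n, qs)
    print(f"n = {n}: i = {i}, q_i = {qs[i - 1]}, (4*) holds: {inequality_4_star_holds(n, qs)}")
```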

If we now multiply both sides of (4*) by $q_1 \cdots q_i$ and take the 
7-th root, we have 


(8*)    $q_1 \cdots q_i < \sqrt[7]{q_1 \cdots q_n}$ 

if n ≥ 114. Combining this with (3*), we have 

(9*)    $q_{n+1} < \sqrt[7]{q_1 \cdots q_n}$,    n ≥ 114. 


This compares with Bonse’s inequalities (1) and (9) of Chapter 27, 
but it is proved for our special primes $q_1, q_2, \cdots$. 

In order to get an inequality like (9*) for the set of all primes 
$p_1, p_2, \cdots$, we first notice that our proof of (9*) shows that the 
sequence of special primes q is not bounded. Therefore, if $p_r$ is any 
prime, we can find a $q_n$ such that $q_n \le p_r < q_{n+1}$. Since $p_{r+1}$ is 
the first prime of any sort that comes after $p_r$, we have $p_{r+1} \le q_{n+1}$. 
Also $q_1 \cdots q_n \le p_1 \cdots p_r$, since the right side includes all the primes 
$q_1, \cdots, q_n$, as well as some that are not of this form. Combining 
these facts with (9*), we have 

(9**)    $p_{r+1} < \sqrt[7]{p_1 \cdots p_r}$. 


We have proved this only for $p_r \ge q_n$, n ≥ 114, that is, for $p_r \ge q_{114}$. 
This is our improvement of Bonse’s inequality. We can say that 
(9**) is true if r is large enough. By checking through a list of 
prime numbers, we could find $q_{114}$ and could then say just how large r 
must be. Even then (9**) would still probably be true for many 
smaller values of r, and these could be determined by actual computation. 
However, the main interest of (9**) is that it is true as 
soon as r passes a certain value and is always true from then on. 
The exact value of this r is of less interest. 

In proving (9**) we have improved Bonse’s inequality by replacing 
the square or cube root by the seventh root, which is considerably 
smaller. However, we have lost something in the process. Bonse’s 
inequalities are true for all except the very smallest values of n, 
n = 4 and n = 5, while we have proved (9**) only for large values 
of r. 

The reader should recognize that this discussion has required 
no mathematical knowledge other than the very simplest fundamen- 
tals, which were also used in Chapter 1. The proof depends 
entirely on pure reasoning. Because of this it shows clearly how 
ingenious and how difficult mathematics can be. In some cases it 
reaches its goal by combining and extending its numerous branches, 
but it also reveals its true spirit in examples such as this one, where 
the argument is developed with the aid of a very minimum of mathe- 
matical knowledge. If this last chapter seems to require a difficult 
chain of thought, if it shows how mathematics can build a real and 
meaningful structure on such a small foundation, then it probably 
exhibits most clearly the real motive of this book. 

CHAPTER 1 


In the words of Euclid (Elements IX, 20), ‘‘Prime numbers are more than any 
assigned multitude of prime numbers.” 

For the proof concerning the gaps in the series of primes see Kronecker, Vorl. 
tiber Zahlentheorte 1, Leipzig 1901, p. 68. 

The existence of infinitely many primes in some other sequences, for example
1, 5, 9, 13, ..., 4n + 1, ... or 3, 7, 11, 15, ..., 4n − 1, ..., both with common
difference 4, can also be proved by elementary means. However, the sequence
2, 6, 10, ..., 4n + 2, ..., with the same difference 4, contains only the single
prime number 2 because all of its terms are even. In general, the theorem
holds for series having any common difference if the first term is relatively prime
to the difference. This was proved by means of higher mathematics in a famous
and difficult paper by Dirichlet (1837), (Abh. d. preuss. Ak. d. Wiss. p. 45-81,
or Werke I, p. 307-342).
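
The behaviour of the three sequences with common difference 4 can be observed directly. The short Python sketch below is an illustration only, with hypothetical function names. It lists the primes up to 100 in each sequence. A finite count of this kind says nothing about infinitude, but it shows the contrast between the first two sequences and the third.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def primes_in_progression(first, difference, limit):
    """Primes among first, first + difference, ... not exceeding `limit`."""
    return [m for m in range(first, limit + 1, difference) if is_prime(m)]


for first in (1, 3, 2):
    found = primes_in_progression(first, 4, 100)
    print(first, len(found), found[:5])
```

Up to 100 the sequence 4n + 1 contains 11 primes, the sequence 4n − 1 contains 13, and the sequence 4n + 2 contains only the prime 2.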


CHAPTER 2 


The representation of the networks as streetcar tracks may make it appear 
that the net of curves must lie in a plane. This is not necessary. All the results 
of this section are valid for networks of curves in space. This is in contrast to 
the subject of Chapter 10, which has a certain external similarity, but which 
does not carry over to space without change. 


CHAPTER 3 


§ 2. The discussion can be carried over directly from the circle to an ellipse. 
The role of the equilateral triangle is then taken over by the triangle for which 
the tangent at each vertex is parallel to the opposite side. 

§ 3. The preliminary step was suggested by E. Steinitz. 


CHAPTER 4 


Anaxagoras—see Diels: Die Fragmente der Vorsokratiker I, no. 46, vol. 3, 2nd 
edition, p. 314,¢_15. 

Plato: Laws, VII, 819d-820c.

The first proof is not to be found among the works of the Greek mathematicians, 
but it is the type of proof that they could have produced and they may have 
known it. The second proof is given by Euclid, Elements X, but an indication 
in Aristotle, Analytica priora, 41a26, 50a37, seems to show that it is older than
Euclid.


CHAPTERS 5 and 6

Schwarz, H. A.: Ges. Abh. II, p. 344-345; see also Steiner, J.: Werke II, p. 
728-729; cf. also p. 95, no. 7 (= Crelle, 16, 1837, p. 88), where the assertion is 
given for both the plane and spherical triangle, and p. 238, 3. 

Fejér’s proof is not printed elsewhere. It has been reproduced with the kind 
permission of the author. 

It is clearly necessary that the triangle have acute angles, since only in that 
case are all the altitudes inside the triangle. In an obtuse triangle the pedal 
triangle would not strictly be an inscribed triangle. In a right triangle, the
pedal triangle reduces to a line. 

In Schwarz’s proof the fact that the triangle is acute is used in assuming that 
the pedal triangle is inscribed in the original triangle. This is used in the reflected 
figure. In Fejér’s proof the pedal triangle does not appear until the end. The 
acuteness of the triangle comes into play in keeping the angle U′AU″ less than
two right angles so that the intersections M and N, of U′U″ with AC and AB,
lie on these sides and not on their extensions. Furthermore because the triangle 
is acute, the foot E of the perpendicular is on the side BC and not on its extension.

An advantage of Fejér’s proof is that it remains valid in non-Euclidean geometry. 
This is not true of Schwarz’s proof, since it uses the fact that the sum of the 
angles of a triangle is 180°, as well as the Euclidean idea of parallel lines. In
particular, since the sides of acute-angled spherical triangles are less than quadrants, 
Fejér’s proof remains valid step for step in the case of a spherical triangle. 


CHAPTER 8 
Viggo Brun has used a more general combinatorial function relating to the 
number of combinations. See Netto, Lehrbuch der Combinatorik, 2nd edition, 
edited by V. Brun and Th. Skolem, Chap. 14, Leipzig, 1927. Brun has also 
discussed the particular function of this chapter in L’Enseignement Mathématique, 
vol. 29 (1930), p. 231-237. 


CHAPTER 9 

Bachet, in his edition of Diophantus’ Arithmetica, has stated that a part of the 
Arithmetica implicitly includes the theorem on the representation of a number 
as a sum of four squares. Fermat gave a sketch of a proof in a letter, and Euler 
and Lagrange, Works, vol. 3, p. 189-201, 1869, proved the theorem in the 
manner indicated in the text.

Waring, Meditationes Algebraicae, 3rd edition, p. 349, Cambridge, 1782, conjec- 
tured that 9 cubes, 19 fourth powers, etc., will suffice. See also C. G. J. Jacobi, 
Werke, vol. 6, p. 322; Wieferich, Math. Ann., vol. 66, p. 95-101, 106-108, 1909; 
Hilbert, Math. Ann., vol. 67, p. 281-300, 1909; Hardy and Littlewood, Math.
Zeitschr., vol. 23, p. 1-37, 1925.

J. Liouville presented his proof at the Collège de France; it is printed by
Lebesgue in the Exercices d’analyse numérique, p. 113-115, 1859.

The “etc.” in Waring’s conjecture may be interpreted to mean that any natural
number n can be represented as a sum of

$$I = 2^k + q - 2$$

kth powers, where q is the greatest integer not surpassing $(3/2)^k$. So many kth
powers are indeed needed for the number

$$n = 2^k q - 1,$$

for which, since it is smaller than $3^k$, only the summands $1^k$ and $2^k$ can be used.
Now Dickson and Pillai proved in 1936 and the following years that for k ≥ 6 the
number I of kth powers indeed suffices for all n if (with Niven’s later improvement)

$$\left(\frac{3}{2}\right)^k - q \le 1 - \frac{q}{2^k}.$$

This condition holds for 2 < k < 400. Whether it is true for all k is not known.
However, for those k for which it might not be valid the number g(k) of required
kth powers has also been determined. For k = 2, 3 the number I (= 4, 9 respectively)
is also good. But for k = 4, 5 we know at present only

$$19 \le g(4) \le 35, \qquad 37 \le g(5) \le 54.$$


The work of Dickson and Pillai rests throughout on the results which Vinogradoff 
obtained with his function-theoretical methods that he developed from the methods
of Hardy and Littlewood. 
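
The arithmetic behind Waring's numbers can be checked for small k. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and the Dickson-Pillai-Niven condition is taken in its commonly stated form (3/2)^k − q ≤ 1 − q/2^k, rewritten with integers as 3^k − q·2^k ≤ 2^k − q. For k = 2, 3, 4, 5 it computes q, I = 2^k + q − 2 and n = 2^k q − 1, and then finds by dynamic programming how many kth powers n really needs.

```python
def waring_numbers(k):
    """Return (q, I, n) with q = floor((3/2)**k), I = 2**k + q - 2, n = 2**k * q - 1."""
    q = 3 ** k // 2 ** k
    return q, 2 ** k + q - 2, 2 ** k * q - 1


def min_kth_powers(n, k):
    """Fewest kth powers of natural numbers with sum n, by dynamic programming."""
    powers = []
    base = 1
    while base ** k <= n:
        powers.append(base ** k)
        base += 1
    best = [0] + [n] * n              # n copies of 1**k always suffice
    for m in range(1, n + 1):
        best[m] = 1 + min(best[m - p] for p in powers if p <= m)
    return best[n]


def condition_holds(k):
    """Integer form of the assumed condition (3/2)**k - q <= 1 - q / 2**k."""
    q = 3 ** k // 2 ** k
    return 3 ** k - q * 2 ** k <= 2 ** k - q


for k in (2, 3, 4, 5):
    q, I, n = waring_numbers(k)
    print(k, q, I, n, min_kth_powers(n, k), condition_holds(k))
```

For these four values of k the computed minimum for n agrees with I (4, 9, 19 and 37 powers for n = 7, 23, 79 and 223). This confirms only that so many kth powers are needed, not that they suffice for every n.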


CHAPTER 10 


§ 1. Gauss, Werke, vol. 8, p. 272, 282-286; see also Julius v. Sz. Nagy, Math.
Zeitschr., vol. 26, p. 579-592, especially p. 580-581, where the theorem is differently
formulated and proved.

§ 4. It is not self-evident that a closed curve that is free of double points 
separates the plane into two regions. The theorem is not true for all surfaces. 
It fails, for example, on the torus. Therefore the theorem must be stated as being 
true on the plane, and it requires a proof. A proof was first given by C. Jordan, 
and the theorem now bears his name. Since Jordan’s theorem does not hold 
for the torus (Fig. 117), our whole argument fails for this surface. In fact, a
curve can be drawn on the torus with two double points in the order 1212 (Fig.
118).

Fig. 117    Fig. 118

§ 5. Concerning alternating knots, see Tait, Trans. Edinburgh Philos. Soc., 
vol. 28, p. 145, 1879.

The knot of Fig. 34 can be deformed, without tearing it, into the knot of Fig. 33, 
an alternating knot. However, it is not true that every knot can be deformed 
into an alternating knot. C. Bankwitz, Math. Ann., vol. 103, p. 161, 1930, has 
given an example of a knot whose projection is never alternating. 

Topics similar to those of §§ 2 and 10 are discussed by J. Petersen, Acta Math.,
vol. 15, p. 193-220, 1891. 


CHAPTER 11 


The theorem of the uniqueness of prime number factorization is not explicitly 
mentioned by Euclid. However, he proves that if a product is divisible by a prime
then at least one factor must be divisible by that prime (Elements VII, 24, 29), a 
theorem from which the uniqueness of prime factorization follows at once. For his 
proof Euclid makes use of the greatest common divisor, instead of the smallest 
common multiple. 

A simple proof, using mathematical induction, has been found in the twentieth 
century (independently by Zermelo, Hasse, Lord Cherwell), which should be 
mentioned here. 
The first few natural numbers certainly have a unique prime factorization.
Indeed, leaving 1 aside as a unit, the numbers 2 and 3 are themselves prime
numbers. Suppose there were numbers with two different prime factorizations;
then there must exist a smallest among them, say

$$N = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s,$$

where the p and q are prime numbers, which in each product we assume ordered
according to their magnitude:

$$p_1 \le p_2 \le \cdots \le p_r, \qquad q_1 \le q_2 \le \cdots \le q_s.$$

All p must be different from all q, for if any $p_i = q_j$ then we could cancel these
factors and would obtain $N/p_i$ as a smaller number than N with two different prime
factorizations. Without loss of generality we can assume $p_1 < q_1$. We then form

$$M = p_1 q_2 q_3 \cdots q_s.$$

This M is divisible by $p_1$, as N is. Therefore the difference $N^* = N - M$ is also
divisible by $p_1$ and thus possesses a prime factorization which includes $p_1$. On the
other hand we have

$$N^* = (q_1 - p_1) q_2 \cdots q_s,$$

in which $q_1 - p_1$ is not divisible by $p_1$, nor is any of the primes $q_2, \ldots, q_s$ equal to $p_1$.
This means that $N^*$ has also a prime factorization in which $p_1$ does not appear
and thus possesses two different prime factorizations. But $N^* < N$, against our
hypothesis that N is the smallest of such numbers. This contradiction disproves
the existence of numbers possessing two different prime factorizations.
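
The theorem itself can be confirmed by brute force over a finite range, which may help in seeing what the proof excludes. The Python sketch below is an illustration only, with hypothetical function names. It builds every product of primes written in non-decreasing order that does not exceed a limit, and counts how often each number arises. A count of exactly 1 for every n means that no number in the range has two different factorizations.

```python
from collections import Counter


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def count_factorizations(limit):
    """For each n <= limit, count the non-decreasing prime sequences with product n."""
    primes = primes_up_to(limit)
    counts = Counter()

    def extend(product, start):
        # Multiply by primes no smaller than the last one used,
        # so every ordered factorization is generated exactly once.
        for index in range(start, len(primes)):
            larger = product * primes[index]
            if larger > limit:
                break
            counts[larger] += 1
            extend(larger, index)

    extend(1, 0)
    return counts


counts = count_factorizations(2000)
print(all(counts[n] == 1 for n in range(2, 2001)))
```

The check is of course no substitute for the proof, since it covers only the numbers up to the chosen limit.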


CHAPTER 13 


§ 10. The properties of geometric figures which remain unchanged under 
distortions (but not tearing) form the subject matter of one branch of mathe- 
matics, topology or analysis situs. Our investigation of the regular polyhedrons 
is purely topological. 

The idea of blowing the polyhedrons up into spheres represents a particular 
topological assumption. We will obtain entirely different results if we investigate 
polyhedrons that are topologically equivalent to a torus, and it will bring out 
the topological significance of §§ 7 to 10. It turns out that there are infinitely 
many ‘‘regular’’ maps (in the topological sense) on the surface of a torus, but that 
none can be realized as a regular polyhedron in the sense of metric geometry. 

On the torus, Euler’s formula becomes (see Chap. 12, § 5)

$$v - e + f = 0. \tag{3*}$$

Equations (4) and (5) obviously remain valid here, so we obtain, in place of (6),

$$f(2\varphi + 2\varepsilon - \varphi\varepsilon) = 0, \tag{6*}$$

and then, since $f \ne 0$,

$$2\varphi + 2\varepsilon - \varphi\varepsilon = 0, \tag{7*}$$

or

$$(\varphi - 2)(\varepsilon - 2) = 4. \tag{9*}$$

Instead of the inequality (9) we now have an equality. The decomposition of
4 into two factors gives the possibilities 1 · 4, 2 · 2, and 4 · 1 for $(\varphi - 2) \cdot (\varepsilon - 2)$.
For $\varphi$ and $\varepsilon$, this gives the table:

$\varphi$ | 3  4  6
$\varepsilon$ | 6  4  3

These values satisfy (9*) and therefore (7*). But then (6*) is satisfied by every
number f, since f · 0 = 0 for all f. Consequently, in this case, f, v, and e cannot
be found from the values of $\varphi$ and $\varepsilon$.
Indeed, there are infinitely many values of f, v, and e for each pair $\varphi$, $\varepsilon$.
For the pair $\varphi = 4$, $\varepsilon = 4$ we can choose any number a > 1 and arrange $a^2$
squares in the form of a square (Fig. 119). This square can be rolled up into a
cylinder (Fig. 120) and the cylinder can be bent into a torus (Fig. 121). Then
the torus is covered with a regular map in which each country has $\varphi = 4$ boundaries,
and $\varepsilon = 4$ countries touch at each vertex. Here we have $f = a^2$, and
it is easy to see that we also have $v = a^2$, $e = 2a^2$. Since a is any number greater
than 1, we have an infinite number of topologically regular maps on the torus.
(It is more difficult to construct the maps for $\varphi = 6$, $\varepsilon = 3$. The case f = 7,
v = 14, e = 21 appears among these maps. This is the map mentioned in Chap. 12,
§ 5, in connection with the coloring problem on the torus.)

Fig. 119    Fig. 120    Fig. 121
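
The counting for the square maps on the torus can be reproduced mechanically. The Python sketch below is an illustration only. The function names are hypothetical, and the two relations tested at the end, 4f = 2e and 4v = 2e, are assumed here to be what equations (4) and (5) state for the case of four boundaries per country and four countries at each vertex. The first function searches for the pairs whose factors (first − 2)(second − 2) equal 4. The second counts vertices, edges and faces of the a-by-a square map after opposite sides have been identified.

```python
def torus_types(search_limit=50):
    """All pairs (phi, eps) with phi, eps >= 3 and (phi - 2) * (eps - 2) == 4."""
    return [(phi, eps)
            for phi in range(3, search_limit)
            for eps in range(3, search_limit)
            if (phi - 2) * (eps - 2) == 4]


def square_map_on_torus(a):
    """Count vertices, edges and faces of the a-by-a square map on the torus.

    Opposite sides of the large square are identified, so every grid point
    (i, j) with 0 <= i, j < a owns one edge to the right, one edge upward,
    and the square cell whose lower-left corner it is.
    """
    points = [(i, j) for i in range(a) for j in range(a)]
    v = len(points)
    e = len([(p, direction) for p in points for direction in ("right", "up")])
    f = len(points)
    return v, e, f


print(torus_types())
for a in (2, 3, 10):
    v, e, f = square_map_on_torus(a)
    print(a, (v, e, f), v - e + f)
```

The search returns the three pairs (3, 6), (4, 4) and (6, 3), and for every a the alternating sum v − e + f comes out as 0.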

None of these polyhedrons can be realized as regular polyhedrons in the sense
of metric geometry. For if $\varphi = 4$, $\varepsilon = 4$, then the faces must be squares. Four
squares around a point lie flat in a plane and do not form a three-dimensional
vertex. No matter how many squares are put together, they will continue to lie in
a plane and will not form the surface of a solid body. The same is true for $\varphi = 6$,
$\varepsilon = 3$, since three regular hexagons around a point lie in a plane. Similarly, for
$\varphi = 3$, $\varepsilon = 6$, six equilateral triangles around a point lie in a plane.

In summary, we notice first that the existence of just five topologically regular 
polyhedrons on a spherical surface is a topological property of the sphere. There 
are infinitely many topologically regular polyhedrons on a surface of the type of 
the torus. Secondly, topological regularity may not imply metric regularity. 
The regular polyhedrons of the type of the torus cannot be realized as regular 
polyhedrons in the metric sense. If the five metrically regular polyhedrons on 
the sphere can be realized, it will be because of special metric properties of the 
sphere. 


CHAPTER 14 
§ 7. Leaving aside the case n = 4 (which is treated in § 8) it suffices to prove the
impossibility of $x^n + y^n = z^n$ in integers for prime numbers n = p only. In the
literature two cases are distinguished: case I, in which x · y · z is not divisible by p,
and case II, in which one (and only one) of the numbers x, y, z is divisible by p. In
case I Fermat’s conjecture has been proved for all prime numbers p < 253,747,889
(by D. H. Lehmer, Emma Lehmer, and Rosser). In case II deeper theorems
of number theory have to be used, and here Fermat’s conjecture has been verified
for all p < 4001, with the use of the SWAC, an electronic digital computing
machine at Los Angeles, by D. H. and Emma Lehmer and Vandiver.


CHAPTER 16 


H. W. E. Jung, Journal f. d. reine u. angew. Mathem., vol. 123, p. 241-257, 1901,
investigates the analogous problem in n dimensions. He finds

$$r = d \sqrt{\frac{n}{2(n + 1)}}$$

as the radius of the spanning sphere for a finite set of points in n dimensions,
with span d. The case n = 2 is the problem of the text.
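
Jung's formula can be compared with two familiar figures. The Python sketch below is an illustration only, with hypothetical function names. It evaluates the radius d·sqrt(n / (2(n + 1))) for span d = 1 and sets it beside the circumradius of an equilateral triangle of side 1 (n = 2) and of a regular tetrahedron of edge 1 (n = 3), the latter taken from the standard value sqrt(6)/4.

```python
import math


def jung_radius(d, n):
    """Radius d * sqrt(n / (2 * (n + 1))) from Jung's formula."""
    return d * math.sqrt(n / (2 * (n + 1)))


def circumradius_equilateral(side):
    """Circumradius of an equilateral triangle: two thirds of its altitude."""
    return (2 / 3) * (math.sqrt(3) / 2) * side


def circumradius_tetrahedron(edge):
    """Circumradius of a regular tetrahedron, edge * sqrt(6) / 4."""
    return edge * math.sqrt(6) / 4


print(jung_radius(1.0, 2), circumradius_equilateral(1.0))
print(jung_radius(1.0, 3), circumradius_tetrahedron(1.0))
```

In both cases the two numbers agree, which suggests that the equilateral triangle and the regular tetrahedron are the sets for which the radius in the formula is actually reached.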


CHAPTER 18 
See Gabriel Koenigs, Leçons de Cinématique, Paris 1897, p. 273-285.


CHAPTER 19 


§ 7. The prime numbers of the form $2^p - 1$, where p itself is a prime number,
are called Mersenne primes (after Père Marin Mersenne, a correspondent of
Fermat and Descartes). Those known at present belong to

p = 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281,

of which the last five were found in 1952-1953 by means of the digital computer
SWAC in Los Angeles. Each Mersenne prime produces a perfect number.
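
The smaller exponents in this list can be confirmed on any modern computer. The Python sketch below is an illustration only. It uses the Lucas-Lehmer test, a standard criterion for numbers of the form 2^p − 1 that is not discussed in the text, and the function names are hypothetical. It also forms the perfect number 2^(p−1)(2^p − 1) that belongs to each Mersenne prime.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for a prime exponent p."""
    if p == 2:
        return True                   # 2**2 - 1 = 3 is prime
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def is_prime(n):
    """Trial division, used only for the exponents themselves."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


def mersenne_exponents(limit):
    """Prime exponents p <= limit for which 2**p - 1 is prime."""
    return [p for p in range(2, limit + 1) if is_prime(p) and lucas_lehmer(p)]


def perfect_number(p):
    """The perfect number 2**(p - 1) * (2**p - 1) belonging to a Mersenne prime."""
    return 2 ** (p - 1) * (2 ** p - 1)


print(mersenne_exponents(130))
print([perfect_number(p) for p in (2, 3, 5, 7)])
```

The first line of output reproduces the exponents 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, and the second gives the perfect numbers 6, 28, 496 and 8128. Raising the limit to 2300 recovers the five exponents found with the SWAC as well, at the cost of a longer running time.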


CHAPTER 22 


J. Steiner, Werke, vol. II, p. 193-195. The completion of the proof indicated 
at the end of the chapter has been given by C. Carathéodory and E. Study, 
Math. Ann., vol. 68, p. 133-140. For a different, complete proof see Edler,
Göttinger Nachrichten, 1882, p. 73.

A remark in Simplikios’ commentary on Aristotle’s De Coelo, Berl. Ak. Ausg. 
VII, p. 412,53, states that Archimedes and Zenodorus had proved the theorem 
for both plane and solid figures. It also asserts that the theorem was known 
even before Aristotle’s time. 


CHAPTER 23 


Gauss, Disquisitiones Arithmeticae, art. 312-318, discusses periodic decimal frac- 
tions, but he makes use of many results from number theory. 

§ 4. The number $\varphi(n)$ depends upon n. It is frequently called Euler’s
function, and may be defined as the number of reduced proper fractions of
denominator n.

§ 8. If p, q, r, ... are the different prime numbers which divide n, then $\varphi(n)$
is given by the formula

$$\varphi(n) = n \cdot \frac{p - 1}{p} \cdot \frac{q - 1}{q} \cdot \frac{r - 1}{r} \cdots$$

The proof of this formula can be found in any textbook of the theory of numbers,
for example, Hardy and Wright, An Introduction to the Theory of Numbers, Oxford
1938, p. 64, theorem 63.
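
The agreement between the formula and the definition by reduced proper fractions can be tested directly. The Python sketch below is an illustration only, with hypothetical function names. One function evaluates the product n · (p − 1)/p · (q − 1)/q · ... over the different primes dividing n, and the other counts the numerators m with 1 ≤ m < n that have no common factor with n.

```python
from fractions import Fraction
from math import gcd


def prime_divisors(n):
    """The different primes dividing n, in increasing order."""
    found = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            found.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        found.append(n)
    return found


def phi_by_formula(n):
    """Euler's function from the product formula, using exact fractions."""
    value = Fraction(n)
    for p in prime_divisors(n):
        value *= Fraction(p - 1, p)
    return int(value)


def phi_by_counting(n):
    """Number of reduced proper fractions m/n, that is 1 <= m < n with gcd(m, n) = 1."""
    return sum(1 for m in range(1, n) if gcd(m, n) == 1)


print([(n, phi_by_formula(n), phi_by_counting(n)) for n in (7, 12, 30, 100)])
```

For every n from 2 upward the two functions return the same value; for instance both give 4 for n = 12 and 8 for n = 30.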
CHAPTER 25


§ 7. The property of a curve of constant breadth asserted by theorem IV 
also holds for all convex curves, and it can be proved directly. However, the proof 
makes use of limiting processes and is best carried out in its analytical formulation. 

The part of the theory of curves of constant breadth which can be handled
by elementary methods is completely given in this chapter. There is a considerable
body of literature concerning the more difficult problems of the theory. The
reader who is familiar with integral and differential calculus may be referred
to W. Blaschke, Kreis und Kugel, Leipzig, 1916.


CHAPTER 26 


§ 1. Lorenzo Mascheroni (1750-1800) states, in the introduction to his book
La geometria del compasso (Pavia, 1797), that he had studied the question of
constructions with the compass alone, originally for practical reasons. Actual
constructions made with a compass are usually more accurate than those made 
with a straightedge, a fact, as Mascheroni knew, that is used by astronomers 
when they wish to produce accurate scales on their instruments. His thorough 
study of constructions with the compass led him to discover that all the con- 
structions of Euclidean geometry can be carried out with the compass alone. 

Mascheroni dedicated his book to Napoleon, praising him as the liberator 
of Northern Italy. In return, Napoleon brought the book to the attention of 
French scholars in a conversation (December 10, 1797) with members of the 
French Academy. In the French translation (Géométrie du Compas, 2nd edition, 
Paris, 1828) this conversation is described as follows: 

“Lagrange et Laplace faisaient partie de la réunion, et dans une conversation
que Bonaparte eut avec ces illustres géomètres, et particulièrement avec Laplace
il leur fit connaître la Géométrie du Compas, ouvrage alors tout nouveau et
inconnu en France, en leur donnant la solution de quelques-uns des problèmes
qui se trouvent dans cette production originale. Après avoir écouté Bonaparte
avec attention, Laplace, qui avait été son professeur de Mathématiques à l’école
de Brienne, lui dit en présence de tous les savans réunis autour d’eux: ‘Nous
attendions tout de vous, général, excepté des leçons de Mathématiques.’”

The only known work of the Danish mathematician Georg Mohr (1640-1697)
is his Euclides Danicus (Amsterdam 1672), and this has only recently been
rediscovered (reprinted with German translation, Copenhagen, 1928). The first part
of this book is devoted to the problem of geometric constructions.

Jacob Steiner (1796-1863), Swiss by birth, was a professor at the University
of Berlin. The problem mentioned in this chapter is taken up in his book “Die
geometrischen Konstructionen, ausgeführt mittelst der geraden Linie und eines
festen Kreises, als Lehrgegenstand auf höheren Unterrichts-Anstalten und zur
praktischen Benutzung” (Berlin, 1833).

The problem as to whether the center of a circle can be found by means of a 
straightedge alone, and the method of solution, go back to David Hilbert. 
It was first published by one of his students, Detlef Cauer (Math. Ann., vol. 73, 
p. 90-94, 1912, and vol. 74, p. 462-464, 1913). 

§§ 6 and 7. We have considered only the case of two non-intersecting circles. 
It would be impossible to find a projection of the type we have used for two inter- 
secting circles. In fact, such a projection would prove that the centers of the 
circles could not be found by means of constructions with the straightedge alone 
but, as was mentioned in § 1, such a construction can be found in this case.
The construction is really quite simple. In the first place, Fig. 122 shows 
how a diameter of a circle can be found if we are given two parallel chords AA’ 
and CC’. Because of symmetry, the line DD’ passes through the center of the 
circle. If a second pair of parallel chords is also known, then a second diameter
can be drawn and the center is determined by the intersection of the two diameters. 
Now we must find a method of constructing two parallel chords. This is shown 
in Fig. 123. The line AA’ is any chord of one of the intersecting circles. Starting 
at A, we draw the line ASB and the line BS’C, thus determining the point C. 


Similarly, starting at A’ we draw A’S’B’ and then B’SC’. The points C and C’ 
determine another chord. To prove that AA’ and CC’ are parallel, we need only 
show that the alternate interior angles 1 and 6 are equal. Going through the 
angles 1, 2, ..., 6 in turn, we see that each is equal to the next, either because
they intercept the same arc or because they are vertical angles. 

A construction to determine the centers of three non-intersecting circles is 
somewhat more complicated. We shall not reproduce it here, since it requires 
a more thorough knowledge of geometry. A construction may be found in 
Cauer’s paper mentioned above. 

§ 8. Other, shorter proofs of the theorem can be found, but they do not 
bring out the purely geometric aspects as well as the proof we have given. We 
chose this proof especially because it emphasizes the essential part of the theorem, 
the symmetry of the oblique cone. 


CHAPTER 27 


Bonse, Archiv d. Math. u. Phys. (3), vol. 12, p. 292-295, 1907; cf. also R. Remak,
Archiv d. Math. u. Phys. (3), vol. 15, p. 186-193, 1908.

Actually, much more than $p_{n+1} < 2p_n$ is now known. However, the best
results are still not good enough to determine whether there is always a prime 
number between every pair of consecutive square numbers, for example, between 
100 and 121, between 121 and 144, and so on. 
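
Both statements in this remark are easy to test over a finite range. The Python sketch below is an illustration only, with hypothetical function names. It checks that each prime is less than twice the preceding one for all primes up to a limit, and it lists every n up to a bound for which no prime lies strictly between n² and (n + 1)². An empty list for a finite bound settles nothing about the general question, which the text describes as open.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def next_prime_below_double(limit):
    """Check that each prime up to `limit` is less than twice its predecessor."""
    primes = primes_up_to(limit)
    return all(b < 2 * a for a, b in zip(primes, primes[1:]))


def squares_without_prime_between(max_root):
    """Every n <= max_root with no prime strictly between n**2 and (n + 1)**2."""
    prime_set = set(primes_up_to((max_root + 1) ** 2))
    return [n for n in range(1, max_root + 1)
            if not any(m in prime_set for m in range(n * n + 1, (n + 1) ** 2))]


print(next_prime_below_double(100000))
print(squares_without_prime_between(300))
```

In the ranges tested the first check succeeds and the list of exceptional n is empty; for example, 101 lies between 100 and 121, and 127 between 121 and 144.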
Excerpts from reviews of


THE ENJOYMENT
OF MATHEMATICS


“A superlative piece of mathematical writing 
to which it is not inappropriate to attach the 
adjective ‘classical’... . It will not only stretch 
the imagination of the amateur but it will also 
give pleasure to the sophisticated mathematician. 
It should prove extremely useful to teachers in 
quickening the mathematical spirit of students.” 
—American Mathematical Monthly 

“A book for dedicated amateur mathemati- 
cians. ... It takes the reader by easy stages into 
some rather advanced mathematical ideas and 
techniques. .. . There are plenty of mathemat- 
ical symbols and equations; nevertheless the 
exposition is not hard to follow. In fact the 
book is designed to be read without the aid of 
an instructor.”—Library Journal 

“From the practical point of view, it will 
supply supplementary material for the use of 
teachers, will illuminate certain areas of instruc- 
tion, and will interest a number of high school 
students who enjoy the subject. But it is more 
important from the recreational and aesthetic 
point of view. Any person who really likes 
mathematics may be assured of somewhere 
between twenty-eight hours and twenty-eight 
evenings of great pleasure with this remarkable
volume.”—The Mathematics Teacher
Mathematics and Plausible Reasoning
BY GEORGE POLYA


This is a guide to the practical art of plausible reasoning, particularly in mathematics
but also in every field of human activity. Using mathematics as the example par
excellence, Professor Polya shows how even that most rigorous deductive discipline
is heavily dependent on techniques of guessing, inductive reasoning, and reasoning
by analogy. In solving a problem, the answer must be guessed at before a proof can
even begin, and guesses are usually made from a knowledge of facts, experience, and
hunches. This work might have been called “How to Become a Good Guesser.”

Volumes I and II together make a coherent work on Mathematics and Plausible
Reasoning. Volume I stands by itself as an essential book for anyone interested in
mathematical reasoning. Volume II builds on the examples of Volume I but is not
otherwise dependent on it. A more sophisticated reader with some mathematical
experience will have no difficulty in reading Volume II independently, though he
will probably want to read Volume I afterward. Professor Polya’s earlier more
elementary book How to Solve It was closely related to Mathematics and Plausible
Reasoning and furnishes some background for it.

“Two outstanding features. . . . First, both volumes are written in a simple yet
beguiling style, approaching the effectiveness of the spoken word. Second, an
abundance of clever problems from a variety of fields stimulates the reader to keep up
with the lively pace set by the author . . . should provide many entertaining hours
for anyone who cares to pick up the challenge.”—Journal of the Franklin Institute

“... the most that the reviewer can hope to accomplish is to induce many readers
of this review to discover for themselves the attractions of this stimulating and
challenging work, which, it is hoped, will in time profoundly influence the teaching of
mathematics.”—Quarterly of Applied Mathematics


Vol. I. Induction and Analogy in Mathematics $5.50
Vol. II. Patterns of Plausible Inference $4.50
Mathematics and Plausible Reasoning, Vols. I & II, $9.00

